Two bugs caused multiple articles to land on the same publish slot:
1. main.py: asyncio.create_task() returned immediately, so a second
pipeline trigger (N8N plus Telegram /run, or two N8N calls) could start
a concurrent run. Added an asyncio.Lock (_pipeline_lock) so that any
trigger arriving while the pipeline is running is rejected immediately.
2. scheduler.py: reserve_publish_slot() read the list of occupied slots
and wrote the new slot over two separate DB connections. Concurrent
threads could both see the same "free" slot before either committed its
write. Fixed by running the entire read-find-write cycle under a
threading.Lock (_slot_lock) on a single DB connection, so the slot check
and the slot assignment are atomic.
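
A minimal sketch of both guards, not the actual code from main.py or
scheduler.py. Only _pipeline_lock, _slot_lock and reserve_publish_slot()
come from this change; run_pipeline(), trigger_pipeline(), the
publish_slots table and the function signature are assumptions made for
illustration, and SQLite is assumed as the database.

```python
import asyncio
import sqlite3
import threading
from typing import Optional, Sequence

# --- Fix 1: reject a trigger while a pipeline run is in progress ---
_pipeline_lock = asyncio.Lock()
_tasks = set()  # keep references so running tasks are not garbage-collected


async def run_pipeline() -> None:
    """Hypothetical stand-in for the real pipeline body."""
    await asyncio.sleep(0.05)


async def _run_and_release() -> None:
    try:
        await run_pipeline()
    finally:
        _pipeline_lock.release()


async def trigger_pipeline() -> str:
    if _pipeline_lock.locked():
        return "rejected"
    # An uncontended acquire() completes without yielding to the event
    # loop, so no other trigger can run between the check and the acquire.
    await _pipeline_lock.acquire()
    task = asyncio.create_task(_run_and_release())
    _tasks.add(task)
    task.add_done_callback(_tasks.discard)
    return "started"


# --- Fix 2: read-find-write under one lock and one connection ---
_slot_lock = threading.Lock()


def reserve_publish_slot(
    db_path: str, article_id: int, candidates: Sequence[str]
) -> Optional[str]:
    """Reserve the first free candidate slot, or return None if all are taken."""
    with _slot_lock:
        conn = sqlite3.connect(db_path)
        try:
            taken = {row[0] for row in conn.execute("SELECT slot FROM publish_slots")}
            for slot in candidates:
                if slot not in taken:
                    conn.execute(
                        "INSERT INTO publish_slots (slot, article_id) VALUES (?, ?)",
                        (slot, article_id),
                    )
                    conn.commit()  # commit before the lock is released
                    return slot
            return None
        finally:
            conn.close()
```

Note that a threading.Lock only serializes threads within one process;
a UNIQUE constraint on the slot column would be a possible extra safety
net if several processes ever write to the same database.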
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
source_extraction.py:
- New _extract_image_metadata(): extracts figcaption text + copyright/credit
per image URL using 3 strategies (figure+figcaption, data-* attributes,
adjacent credit spans)
- ExtractedArticle gets new image_metadata field
- extracted_article_to_meta() includes image_metadata in stored JSON
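
The three strategies could look roughly like the sketch below. It uses
only the standard-library HTMLParser and is not the actual
_extract_image_metadata(). The attribute names data-caption and
data-credit, the "credit" class name, and the returned dict layout are
assumptions. "Adjacent" is approximated as: a credit span that follows an
image with no visible text in between.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class _ImageMetaParser(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.meta: Dict[str, Dict[str, str]] = {}
        self._figure_imgs: Optional[List[str]] = None  # None outside <figure>
        self._last_img: Optional[str] = None           # candidate for a credit span
        self._collect: Optional[str] = None            # "caption" or "credit"
        self._buf: List[str] = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "figure":
            self._figure_imgs = []
        elif tag == "img" and a.get("src"):
            src = a["src"]
            entry = self.meta.setdefault(src, {})
            # Strategy 2: data-* attributes on the <img> itself.
            if a.get("data-caption"):
                entry["caption"] = a["data-caption"]
            if a.get("data-credit"):
                entry["credit"] = a["data-credit"]
            self._last_img = src
            if self._figure_imgs is not None:
                self._figure_imgs.append(src)
        elif tag == "figcaption" and self._figure_imgs is not None:
            # Strategy 1: <figure> + <figcaption>.
            self._collect, self._buf = "caption", []
        elif (tag == "span" and "credit" in (a.get("class") or "").split()
              and self._last_img and self._collect is None):
            # Strategy 3: credit span directly after an image.
            self._collect, self._buf = "credit", []

    def handle_data(self, data):
        if self._collect:
            self._buf.append(data)
        elif data.strip():
            self._last_img = None  # visible text breaks adjacency

    def handle_endtag(self, tag):
        text = " ".join("".join(self._buf).split())
        if tag == "figcaption" and self._collect == "caption":
            for src in self._figure_imgs or []:
                self.meta[src].setdefault("caption", text)
            self._collect = None
        elif tag == "span" and self._collect == "credit":
            self.meta[self._last_img].setdefault("credit", text)
            self._collect, self._last_img = None, None
        elif tag == "figure":
            self._figure_imgs = None


def extract_image_metadata(html: str) -> Dict[str, Dict[str, str]]:
    """Return {image_url: {"caption": ..., "credit": ...}} for one document."""
    parser = _ImageMetaParser()
    parser.feed(html)
    return parser.meta
```

In this sketch a figcaption that appears before its image is ignored, and
a credit span inside a figcaption is folded into the caption text.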
pipeline.py:
- After auto image selection, check if selected_url is set
- Articles without a usable image get status "no_image" (excluded, with a Telegram notice)
- PipelineStats and summary report include no_image counter
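
A rough sketch of the check, assuming the article is a plain dict. Only
selected_url, the "no_image" status and the no_image counter come from
this change; apply_image_selection(), the with_image counter and the
image_url key are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineStats:
    with_image: int = 0  # hypothetical counter for comparison
    no_image: int = 0


def apply_image_selection(
    article: dict, selected_url: Optional[str], stats: PipelineStats
) -> dict:
    """Mark the article as no_image when auto selection found nothing usable."""
    if selected_url:
        article["image_url"] = selected_url
        stats.with_image += 1
    else:
        article["status"] = "no_image"
        stats.no_image += 1
    return article
```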
db.py:
- Add "no_image" to articles status CHECK constraint
- Migration: recreates articles table with updated constraint on existing DBs
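
Assuming the database is SQLite, where a CHECK constraint cannot be
altered in place, the migration could follow the usual
create-copy-drop-rename pattern sketched below. The column set and every
status value other than "no_image" are placeholders, not the real
schema, and migrate_add_no_image() is a hypothetical name.

```python
import sqlite3

# Placeholder status list; only "no_image" is taken from this change.
STATUSES = ("new", "published", "no_image")


def migrate_add_no_image(conn: sqlite3.Connection) -> bool:
    """Recreate articles with the widened CHECK; return True if it ran."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'articles'"
    ).fetchone()
    if row is None or "no_image" in row[0]:
        return False  # fresh DB or already migrated

    allowed = ", ".join("'%s'" % s for s in STATUSES)
    try:
        conn.executescript(
            """
            BEGIN;
            CREATE TABLE articles_new (
                id INTEGER PRIMARY KEY,
                title TEXT NOT NULL,
                status TEXT NOT NULL CHECK (status IN (%s))
            );
            INSERT INTO articles_new (id, title, status)
                SELECT id, title, status FROM articles;
            DROP TABLE articles;
            ALTER TABLE articles_new RENAME TO articles;
            COMMIT;
            """ % allowed
        )
    except sqlite3.Error:
        conn.rollback()  # leave the old table untouched on failure
        raise
    return True
```

Checking the stored CREATE statement in sqlite_master keeps the migration
idempotent, so it can run on every startup.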
workflow.py / main.py:
- Map no_image to its own UI status with rewrite/close transitions
wordpress.py:
- _upload_featured_media() accepts image_caption param, sends to WP media
- _get_image_meta_for_url() / _build_image_caption() helpers
- _build_attribution_block(): separator + attribution paragraph at article end
(original link, author, Bildnachweis (image credit))
- _build_post_content() appends attribution block
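
As an illustration only, the attribution block might be assembled like
this. The function names mirror the helpers above, but the signatures,
the HTML markup and the label wording are assumptions, not the actual
wordpress.py code.

```python
from html import escape
from typing import Optional


def build_attribution_block(
    source_url: str,
    author: Optional[str] = None,
    image_credit: Optional[str] = None,
) -> str:
    """Return a separator plus one attribution paragraph (assumed markup)."""
    url = escape(source_url, quote=True)
    parts = ['Original: <a href="%s">%s</a>' % (url, url)]
    if author:
        parts.append("Author: %s" % escape(author))
    if image_credit:
        parts.append("Bildnachweis: %s" % escape(image_credit))
    return "<hr />\n<p>" + " | ".join(parts) + "</p>"


def build_post_content(
    body_html: str,
    source_url: str,
    author: Optional[str] = None,
    image_credit: Optional[str] = None,
) -> str:
    """Append the attribution block to the end of the article body."""
    block = build_attribution_block(source_url, author, image_credit)
    return body_html.rstrip() + "\n" + block
```

Escaping the scraped author and credit strings keeps stray markup from
the source page out of the published post.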
telegram_bot.py:
- notify_pipeline_done() shows 🖼️ no-image count
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Webhook returns 200 immediately; processing runs in a background task
→ Telegram no longer retries, which eliminates duplicate callbacks and 400 errors
- Consolidate answer_callback_query call to top of handler (before heavy work)
- Add logger.info/error for callback actions to aid debugging
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pipeline runs in the background via asyncio. The endpoint returns
immediately; results arrive via Telegram notifications.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>