rss-news

Author	SHA1	Message	Date
OliverGiertz	2d02b56b65	feat(admin): WordPress→DB sync for scheduled slots. Adds sync_db_from_wordpress(), which treats WordPress as the source of truth: future posts update scheduled_publish_at to WP's actual date; draft posts clear scheduled_publish_at (not yet scheduled); published posts mark the article as 'published' in the DB; trashed/deleted posts clear wp_post_id, wp_post_url, and the slot so the article can be re-processed. Exposed via POST /admin/wp-sync with a sync button on the schedule page. Run after any manual rescheduling in WordPress to bring the DB back in sync. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 08:53:44 +00:00
OliverGiertz	764e7bff6a	fix(ingestion): skip data: URIs and known placeholder images. ingestion.py: filter out data:image/... inline URIs before ranking; penalise (-300) known placeholder paths (some-default.jpg etc.). wordpress.py: _is_usable_image_url rejects data: URIs and placeholder paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-07 09:09:44 +00:00
OliverGiertz	426a799371	fix(wordpress): use status=future for posts with a future scheduled_publish_at. WordPress ignores the date field for draft posts and shows "Sofort veröffentlichen" ("Publish immediately") instead. Setting status=future causes WP to display and honour the scheduled date, auto-publishing the post at the given time as intended. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-29 14:29:25 +00:00
OliverGiertz	1a8d0775c7	fix(wordpress): correctly detect bare credit marker prefix before caption fallback. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 08:47:09 +00:00
OliverGiertz	45c533c674	fix(wordpress): extract credit portion from caption for attribution block. When the credit field only captured a marker prefix (e.g. "Foto:") due to CSS-class-based extraction picking up only the label element, fall back to regex-extracting the credit line from the full figcaption caption text. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 08:41:28 +00:00
OliverGiertz	d1cb809852	fix(wordpress): fix attribution block source name and image credit lookup. Derive the real source hostname from the canonical URL when the feed name is generic (e.g. "Google Alerts"), so the link shows "moin.de" instead of "Google Alerts"; use _get_image_meta_for_url() (fuzzy URL matching) for image credit lookup; use the caption field for the Bildnachweis (image credit) since it already contains embedded credits. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 08:28:44 +00:00
OliverGiertz	82f2df610d	fix(wordpress): fuzzy URL match for image metadata and simplify caption builder. Image metadata keys may have query params (e.g. ?w=1200) that differ from the selected_url stored in image_review. Fall back to comparing URLs without the query string so the figcaption text is correctly found. Also simplified _build_image_caption: figcaption text already contains the credit info, so just use caption directly instead of appending the redundant credit prefix marker. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 08:24:40 +00:00
OliverGiertz	aaac5def27	feat(pipeline): image caption/credit extraction, no-image exclusion, WP attribution. source_extraction.py: new _extract_image_metadata() extracts figcaption text + copyright/credit per image URL using 3 strategies (figure+figcaption, data-* attributes, adjacent credit spans); ExtractedArticle gets new image_metadata field; extracted_article_to_meta() includes image_metadata in stored JSON. pipeline.py: after auto image selection, check if selected_url is set; articles without usable image → status "no_image" (excluded with Telegram notice); PipelineStats and summary report include no_image counter. db.py: add "no_image" to articles status CHECK constraint; migration recreates articles table with updated constraint on existing DBs. workflow.py / main.py: map no_image as own UI status with rewrite/close transitions. wordpress.py: _upload_featured_media() accepts image_caption param, sends to WP media; _get_image_meta_for_url() / _build_image_caption() helpers; _build_attribution_block(): separator + attribution paragraph at article end (original link, author, Bildnachweis/credit); _build_post_content() appends attribution block. telegram_bot.py: notify_pipeline_done() shows 🖼️ no-image count. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 07:08:48 +00:00
OliverGiertz	1963e32ab4	fix(rewrite): make image upload non-fatal and add rewrite tracing logs. wordpress.py: catch image download/upload failures and skip the image instead of aborting the entire WP draft update. pipeline.py: add INFO logs at each step of _do_rewrite_and_draft to trace OpenAI call, tag generation, and WP API call. telegram_bot.py: add INFO logs around rewrite execution + exc_info on error for full traceback in logs. repositories.py: include scheduled_publish_at in get_article_by_id. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-26 07:45:55 +00:00
OliverGiertz	970f509ad4	feat(wordpress): store suggested publish date directly in WP draft. Reserve the publish slot before creating the WP draft so the scheduled_publish_at timestamp is available when building the post payload. WordPress receives the `date` field (e.g. 2026-03-24T09:00:00), which sets the scheduled publish time on the draft. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-21 11:15:39 +00:00
OliverGiertz	6192f8e527	feat(automation): autonomous pipeline with Telegram bot and N8N integration. Add full auto pipeline: RSS ingest → GPT relevance score → AI rewrite → WP draft; add Telegram bot with inline buttons (rewrite/discard/override) and commands (/run, /rejected, /status); add smart publish scheduler: max 2 drafts/day, spread over the week (09:00 & 14:00 CET); add N8N API endpoints (/api/n8n/pipeline, /api/n8n/ingest) with X-API-Key auth; add GPT-based relevance scoring (0-100) for VanLife/Camping/Outdoor topics; remove Ampel (traffic-light) risk-level policy check from ingestion (all enabled feeds are used); add Telegram webhook endpoint and setup endpoint; add delete_wp_post() for Telegram discard action; add DB migrations for relevance_score and scheduled_publish_at columns; update .env.example with all new configuration variables; add docs/AUTOMATION.md with full setup and usage documentation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-21 09:40:15 +00:00
Oliver G	6332a9a399	feat(wordpress): publish true Gutenberg blocks and remove auto summary/details sections	2026-02-21 14:55:20 +01:00
Oliver G	b0f995d5c9	feat(rewrite): add batch rewrite run, AI tags for WP, and agentur contact detection	2026-02-21 14:39:47 +01:00
Oliver G	88b2ee1d01	feat(admin): add feed/source management, rewrite editor, reopen flow, and WP block output	2026-02-21 14:03:49 +01:00
Oliver G	35ccceb260	feat(workflow): simplify article flow and add automated rewrite step	2026-02-21 13:43:22 +01:00
Oliver G	24d8e5ad0f	feat(wordpress): improve post html structure and excerpt generation	2026-02-21 13:09:00 +01:00
Oliver G	e68b6a41fd	feat(wordpress): upload selected image and set featured_media on draft publish	2026-02-21 13:07:08 +01:00
Oliver G	1cee56205e	feat(publisher): add wordpress draft queue with retry and admin controls	2026-02-18 10:49:43 +01:00

18 commits
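
The WordPress→DB sync rules listed in commit 2d02b56b65 can be read as a small status-to-update mapping. The following is a minimal sketch of that mapping only, not the repository's actual sync_db_from_wordpress(): the helper name plan_sync_update is hypothetical, the use of a `status` column for the 'published' mark is an assumption, and so is treating the "slot" as the scheduled_publish_at column. The status strings ("future", "draft", "publish", "trash") are the values the WordPress REST API reports for a post.

```python
from typing import Any, Optional


def plan_sync_update(wp_status: Optional[str], wp_date: Optional[str]) -> dict[str, Any]:
    """Return the DB column changes for one article (hypothetical helper).

    wp_status is the WordPress post status, or None if the post no longer
    exists. wp_date is the post's date as reported by WordPress.
    """
    if wp_status == "future":
        # Scheduled in WP: adopt WordPress's actual date.
        return {"scheduled_publish_at": wp_date}
    if wp_status == "draft":
        # Not yet scheduled: clear the slot.
        return {"scheduled_publish_at": None}
    if wp_status == "publish":
        # Already live: mark the article as published (column name assumed).
        return {"status": "published"}
    if wp_status in ("trash", None):
        # Trashed or deleted: detach the post so the article can be re-processed.
        return {"wp_post_id": None, "wp_post_url": None, "scheduled_publish_at": None}
    # Any other status (e.g. pending, private) is left untouched in this sketch.
    return {}


if __name__ == "__main__":
    print(plan_sync_update("future", "2026-04-14T09:00:00"))
    print(plan_sync_update(None, None))
```

Keeping the mapping as a pure function, separate from the HTTP and database calls, makes each rule from the commit message checkable in isolation.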