AI Engineer & Data Consultant

I help companies turn messy data and internal knowledge into reliable AI tools, APIs, and automation that actually run in production.

What I do

AI automations and chatbots

Custom AI agents, RAG pipelines, and chatbots that use your own documents, PDFs, and databases. I focus on reliability, observability, and clear failure modes—not just AI demos.

Data extraction and ETL from PDFs and reports

I build robust pipelines to extract, normalize, and validate data from PDFs, Excel files, and other unstructured sources. Output can be delivered as CSV files, as database-ready tables, or through an API.

APIs and backend for AI products

Design and implementation of Python and Node.js backends, REST APIs, and integration layers that connect your AI models to real users and business workflows.

Who I work with

I work with product teams, founders, and operations leaders who need working AI systems—not experiments. Most of my clients are in Europe and the US, across healthcare, HR, real estate, and energy.

Healthcare · HR · Real estate · Energy

Selected projects

AI-powered PDF to structured CSV pipeline

Built an AI-assisted extraction service that converts irregular PDF and Excel reports into structured CSV ready for analytics. Hybrid approach: deterministic table parsing plus LLM-based field mapping and validation, exposed through a REST API and a small web UI.
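As a rough illustration of the hybrid idea, the sketch below tries a deterministic header lookup first and only then consults a fallback mapper, which stands in for the LLM-based step. The field names, the alias table, and the `fallback` callable are hypothetical and are not taken from the project itself.

```python
from typing import Callable, Optional

# Hypothetical alias table: raw header text -> canonical field name.
ALIASES = {
    "invoice no": "invoice_id",
    "invoice number": "invoice_id",
    "total": "amount",
    "amount due": "amount",
    "date": "issued_on",
}
KNOWN_FIELDS = set(ALIASES.values())


def map_header(raw: str,
               fallback: Optional[Callable[[str], Optional[str]]] = None) -> Optional[str]:
    """Map a raw column header to a canonical field name.

    The deterministic lookup runs first. `fallback` stands in for an
    LLM-based mapper and is consulted only when the lookup fails.
    """
    # Normalize case, underscores, and repeated whitespace before the lookup.
    key = " ".join(raw.lower().replace("_", " ").split())
    if key in ALIASES:
        return ALIASES[key]
    if fallback is not None:
        candidate = fallback(raw)
        # Accept the fallback's answer only if it names a known field.
        if candidate in KNOWN_FIELDS:
            return candidate
    return None


def validate_row(row: dict) -> list:
    """Return a list of validation problems for one mapped row (empty if clean)."""
    problems = []
    if not row.get("invoice_id"):
        problems.append("missing invoice_id")
    try:
        float(str(row.get("amount", "")).replace(",", ""))
    except ValueError:
        problems.append("amount is not numeric")
    return problems
```

Rejecting fallback answers that fall outside the known schema keeps the model-based step from introducing columns that the downstream CSV does not expect.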

Tech: Python, FastAPI, PostgreSQL, vector DB, OpenAI

Internal knowledge base and chatbot for a services company

Ingested years of PDF documents and call transcripts into a retrieval-friendly knowledge base. Deployed an internal chatbot for staff that can answer questions, surface relevant snippets, and review drafts using only the company data.

Tech: Python, FastAPI, PostgreSQL, vector DB, LLM APIs

RAG and API layer for compliance and policy questions

Developed a RAG pipeline to answer complex policy and compliance questions, with strict source citation, retrieval filters, and clear confidence handling. Wrapped in a FastAPI backend with authentication and logging for auditability.
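One way to picture the retrieval filter and confidence handling is sketched below. It assumes each retrieved chunk carries a similarity score and a document-type tag; the `Chunk` fields, the 0.75 threshold, and the status strings are illustrative choices, not details of the deployed system.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str    # document identifier used for citation
    doc_type: str  # metadata used by the retrieval filter
    score: float   # similarity score in [0, 1], higher is better


def answer_context(chunks, allowed_types, min_score=0.75, top_k=3):
    """Filter retrieved chunks by metadata and score, then decide whether to answer."""
    eligible = [c for c in chunks
                if c.doc_type in allowed_types and c.score >= min_score]
    eligible.sort(key=lambda c: c.score, reverse=True)
    selected = eligible[:top_k]
    if not selected:
        # Nothing passes the filter: signal low confidence instead of guessing.
        return {"status": "insufficient_evidence", "citations": [], "context": ""}
    return {
        "status": "ok",
        "citations": [c.source for c in selected],
        "context": "\n\n".join(f"[{c.source}] {c.text}" for c in selected),
    }
```

Returning an explicit `insufficient_evidence` status lets the API layer respond with a refusal or an escalation path rather than an uncited answer.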

Tech: Python, FastAPI, PostgreSQL, vector DB, OpenAI

Research & Publications

Article · Lightning study

Positive Lightning Dominance over Portugal (4–5 November 2025)

Author: Cristian Iordache. Location: Cascais, Portugal.

Abstract

Between 4 and 5 November 2025, the IPMA lightning detection network reported an unusually high ratio of positive to negative cloud-to-ground (CG) flashes over mainland Portugal. In the initial 24-hour period, 23,080 positive and 11,574 negative CG flashes were recorded; by 23:45 UTC on 5 November, totals had reached approximately 37,000 positive and 18,000 negative flashes, yielding a positive-polarity fraction of ≈67%. This note documents the event and characterizes the environment using GFS 0.25° analyses, EUMETSAT imagery, and IPMA radar. Persistent deep-layer shear near 16–17 m/s combined with localized high instability suggests organized convection with extensive stratiform anvils and a dominance of positive CG. The magnitude and duration of the polarity imbalance appear unprecedented for the Iberian Peninsula.

Figure 1. Polarity totals from IPMA lightning page, 24 h window ending 05 Nov ~08:34 UTC.

1. Observations

Polarity totals were taken from the operational IPMA DEA page (rolling 24 h window; accessed 05 Nov 2025). The operational domain covers mainland Portugal and adjacent Atlantic areas. Screenshots and animations show significant electrical activity over the central coast and offshore.
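The positive-polarity fraction quoted in the abstract follows directly from the flash totals. The short check below recomputes it for both the initial 24-hour counts and the approximate cumulative totals; the function name is only illustrative.

```python
def positive_fraction(n_pos: int, n_neg: int) -> float:
    """Fraction of cloud-to-ground flashes with positive polarity."""
    total = n_pos + n_neg
    if total == 0:
        raise ValueError("no flashes recorded")
    return n_pos / total


# Totals quoted in the abstract.
first_24h = positive_fraction(23_080, 11_574)
cumulative = positive_fraction(37_000, 18_000)

print(f"{first_24h:.1%}, {cumulative:.1%}")  # prints "66.6%, 67.3%"
```

Both windows give roughly two positive flashes for every negative one.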

2. Environmental context

NOAA GFS 0.25° analysis fields were used at 00/06/12/18 UTC on 4 Nov and 00/06 UTC on 5 Nov 2025. Deep-layer shear (0–6 km, approximated via 925 and 500 hPa winds) persisted at 16–17 m/s. Domain mean CAPE held near 50–60 J/kg with localized maxima greater than 2000 J/kg near coastal convergence.
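A minimal sketch of the shear approximation, assuming the bulk shear is taken as the magnitude of the vector wind difference between the 500 hPa and 925 hPa levels. The wind components in the example are made-up values for illustration, not numbers from the GFS analyses.

```python
import math


def bulk_shear(u_low: float, v_low: float, u_high: float, v_high: float) -> float:
    """Magnitude (m/s) of the vector wind difference between two levels."""
    return math.hypot(u_high - u_low, v_high - v_low)


# Illustrative wind components in m/s (made-up, not from the GFS fields):
# low level stands in for 925 hPa, high level for 500 hPa.
shear = bulk_shear(u_low=3.0, v_low=4.0, u_high=15.0, v_high=9.0)
print(f"{shear:.1f} m/s")  # prints "13.0 m/s"
```

Using the vector difference rather than the difference in wind speeds matters, because a change in wind direction with height contributes to shear even when the speeds are equal.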

Figure 2. Time series of 0–6 km shear and mean CAPE from GFS 0.25° over Portugal.
Figure 3. Illustrative satellite or radar frame during the event.

3. Discussion

The combination of persistent deep-layer shear and localized instability suggests an organized convective system with extensive stratiform shields and cold cloud tops, favoring asymmetric charge separation and positive CG dominance. This pattern is consistent with prior case studies in sheared convection and mature or decaying MCS stages.

4. Conclusions

This preliminary note documents a rare case of positive CG dominance over mainland Portugal. Hourly polarity data from IPMA, when available, will allow a quantitative link between storm structure and positive CG fraction. The results warrant a fuller analysis aimed at peer-reviewed publication.

Preprint available at Zenodo: https://doi.org/10.5281/zenodo.17538198

Submitted to: Weather – The International Journal of Meteorology (RMetS), November 2025.


Research Notes · Archival transients

Cross-matching Historical POSS-I Transients with Gaia DR3 to Identify Vanished Sources

Author: Cristian Iordache.

Motivation

Historical plates contain many candidate transients and variables. By cross-matching their positions with modern catalogs like Gaia DR3, we can flag sources that lack a present-day counterpart. Such sources may be plate artifacts, large-amplitude variables, or genuinely vanished sources worth follow-up.

Method

A small proof-of-concept takes a list of POSS-I candidate positions, queries Gaia DR3 within a fixed cone, and counts counterparts. Results are summarized in a table and a simple chart. The approach scales to larger sets and can include photometric normalization, artifact filters, and ranked scoring.
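The counting step can be sketched offline as below. This assumes the Gaia DR3 positions have already been downloaded into a local list; the 5-arcsecond radius, the candidate names, and all coordinates are made-up placeholders, since the note does not state the cone size or list the positions.

```python
import math


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def count_counterparts(candidates, catalog, radius_arcsec=5.0):
    """Count catalog sources inside a fixed cone around each candidate."""
    radius_deg = radius_arcsec / 3600.0
    return {
        name: sum(1 for (cra, cdec) in catalog
                  if angular_sep_deg(ra, dec, cra, cdec) <= radius_deg)
        for name, (ra, dec) in candidates.items()
    }


# Made-up positions (RA, Dec in degrees), for illustration only.
candidates = {
    "cand_a": (10.0, 20.0),
    "cand_b": (150.0, -5.0),
    "cand_c": (200.0, 45.0),
}
catalog = [
    (10.0003, 20.0002),
    (10.0005, 19.9996),
    (150.0001, -5.0001),
    (201.0, 45.0),
]

counts = count_counterparts(candidates, catalog)
no_counterpart = [name for name, n in counts.items() if n == 0]
```

The brute-force loop is fine for a handful of positions; a larger candidate set would more likely use a catalog service cone search or a spatial index instead.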

Figure A. Summary table for three sample positions and their Gaia counterpart counts.
Figure B. Counts of Gaia counterparts per candidate. One candidate has no counterpart.

Example

In the sample of three positions, two have Gaia counterparts and one has none. The absence of a counterpart is a useful signal for deeper checks with other surveys such as Harvard DASCH or Sonneberg.

Next steps

Extend to a larger candidate set, add DSS cutouts, normalize plate photometry, and create a ranked list of no-counterpart cases with confidence scores. A notebook can make the pipeline fully reproducible.

Downloads: Zenodo DOI: 10.5281/zenodo.17555319 · CSV data

LLM Time Server Case Study

A case study on temporal grounding for LLMs.


Let's talk about your project

If you're planning an AI project or need to turn documents and data into something your team can actually use, send me a short description and I'll get back with a concrete plan.