Paula Guerrero Castelló

NLP Specialist & Computational Linguist

📍 Alicante, Spain 🎓 MSc NLP @ UPV/EHU 🔬 Machine Translation

01. About

I am currently a Master's student in Natural Language Analysis and Processing at the University of the Basque Country (UPV/EHU), where my research focuses on machine translation for low-resource languages, LLM fine-tuning, and multilingual corpus development. Before this, I completed a degree in Translation and Interpreting at the Universitat d'Alacant, with English, French, Spanish, and Chinese as working languages. That training gave me a strong grounding in linguistics, cross-linguistic analysis, and translation practice.

My main research interests lie at the intersection of language technologies and translation: specifically, improving MT quality for under-resourced languages, adapting large language models to low-data settings, and studying how machines and humans approach the translation task differently. I am particularly interested in regional languages such as Valencian and in adapting MT models to the literary domain.

Outside of research, I have a passion for travel and new experiences. I spent a semester on an Erasmus exchange in Brussels, took part in a Work and Travel programme in New York, and whenever I get the chance, I find myself planning the next trip. Living and studying in different countries has shaped the way I think about language, culture, and communication.

02. Experience

NLP Intern (Conversational AI)

1MillionBot Apr 2026 – Present · Alicante, Spain
  • Train and configure conversational AI models for customer service applications.

NLP Intern

Prompsit Language Engineering Feb 2025 – Jun 2025 · Elche, Spain
  • Evaluated and curated multilingual corpora for AI training and machine translation quality assessment.
  • Developed machine translation systems with Seq2Seq architectures in Python.
  • Contributed to the EU-funded MaCoCu project, promoting under-resourced languages through corpus annotation on the Keops platform.
  • Identified and categorized errors in multilingual corpora; collaborated on annotation guidelines.
  • Evaluated MUTNMT and PROMUT MT training applications.

Aquatics Team Member

Six Flags Darien Lake Jun 2025 – Aug 2025 · Darien Center, NY, USA
  • International work-and-travel experience in a high-traffic, multilingual environment.
  • Developed cross-cultural communication and teamwork skills.

03. Education

MSc

Natural Language Analysis and Processing

University of the Basque Country (UPV/EHU)

Oct 2025 – Feb 2027 · San Sebastián, Spain

Deep learning & transformer architectures (PyTorch), NLP, Machine Learning, Machine Translation, LLMs, corpus creation, RAG, Question Answering, Named Entity Recognition, Speech Recognition and Synthesis, Text Classification, Information Retrieval. Focus on low-resource language MT.

BSc

Translation and Interpreting

University of Alicante

Sep 2021 – Jun 2025 · Alicante, Spain

Working languages: English, French, Spanish, and Chinese. Final GPA: 8.85/10. Thesis: "Video Game Localization Through a Gender Lens: The Case of Cyberpunk 2077".

🌍

Erasmus Exchange — Translation & Interpreting

Université Saint-Louis, Brussels

Sep 2023 – Feb 2024 · Brussels, Belgium

Immersive multilingual academic experience at a leading Belgian university. French-speaking environment within the European Union context, strengthening intercultural communication and translation skills.

Corpus Processing & Programming for Computational Linguists

Self-study, course materials by Francis Tyers (Indiana University)

Self-paced

Python & Unix for corpus processing and computational linguistics.

BachiBac — Baccalauréat & Bachillerato

IES Carrús, Elche

Sep 2019 – Jun 2021 · Elche, Spain

Dual French–Spanish qualification. Graduated with highest honours: 9.98/10.

04. Projects

Clinical Text Simplification with Fine-Tuned LLMs

Feb 2026 – Present

Fine-tuned Qwen3.5-0.8B and Llama-3.2-1B with LoRA on the CLARA-MeD clinical corpus for automatic medical text simplification in Spanish. Achieved results comparable to human simplifications in meaning preservation. Deployed interactive demos on Hugging Face.

LoRA · Text Simplification · Healthcare NLP

MaCoCu Corpus Validation

Prompsit Language Engineering · 2025

Performed detailed validation of corpus segments for the EU-funded MaCoCu project. Created annotation guidelines for bilingual corpora; resolved segmentation, coherence, and cross-lingual alignment issues using the Keops platform.

Corpus Annotation · EU Project · Under-resourced Languages

05. Skills

NLP & AI

  • PyTorch & Transformer Architectures
  • LLM Fine-Tuning (SFT, GRPO/RL, LoRA)
  • Machine Translation (Seq2Seq, Neural MT, LLMs)
  • Text Classification & Sequence Labeling
  • Named Entity Recognition (NER)
  • Text Summarization & Question Answering
  • Hugging Face Ecosystem

Programming & Tools

  • Python
  • scikit-learn, NumPy, pandas
  • Corpus Processing & UNIX
  • PyTorch, Transformers, spaCy, NLTK
  • CAT Tools (SDL Trados, memoQ)
  • Subtitling Software (Aegisub, Subtitle Edit)

Languages

  • Spanish — Native
  • Catalan/Valencian — Native
  • English — C1/C2 Proficient
  • French — C1/C2 Proficient
  • Chinese (Mandarin) — B1 Independent
  • German — A2 Basic

Research & Linguistics

  • Corpus Creation & Curation
  • Corpus Annotation & Quality Assessment
  • Computational Linguistics
  • Translation Studies
  • Low-Resource Language Processing
  • Localization & Gender Studies in Translation

06. Contact

I'm open to research collaborations and NLP-related projects. Feel free to reach out.