audrey.feldroy.com
The experimental notebooks of Audrey M. Roy Greenfeld. This website and all its notebooks are open-source at github.com/audreyfeldroy/audrey.feldroy.com
# Building a Better Title-Caser, Part 1: Beyond Python str.title
by Audrey M. Roy Greenfeld | Fri, Feb 14, 2025
Title-casing text is one of those hard problems no one ever gets right, yet no one considers worthy enough to solve with AI. Here I experiment to see if I can improve upon the latest best solutions with a local Ollama modelfile and a solid prompt.
Setup
import google.generativeai as genai
import ollama
from titlecase import titlecase
Built-In title
I begin by seeing what str.title does. It's built into Python, so nothing needs to be installed.
"hi".title()
PyPI titlecase
Now with pip install titlecase, I get this titlecase function:
titlecase("hi")
Simple Multi-Word Comparison
Both of these functions should do well with a simple test case:
text = "the quick brown fox"
print(f"title(): {text.title()}")
print(f"titlecase(): {titlecase(text)}")
With Apostrophes
text2 = "it's a beautiful day in mr. rogers' neighborhood"
print(f"\ntitle(): {text2.title()}")
print(f"titlecase(): {titlecase(text2)}")
Here titlecase correctly kept the small words "a" and "in" lowercase.
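The apostrophe case is where str.title struggles most: it treats every run of letters as a separate word, so the letter after an apostrophe gets capitalized. The sketch below contrasts that with the regex workaround suggested in the Python documentation for str.title. The function name title_keep_apostrophes is made up for illustration, and the workaround still capitalizes small words like "a" and "in", so it is not a substitute for titlecase.

```python
import re

def title_keep_apostrophes(s):
    # Capitalize each run of letters, treating an apostrophe followed by
    # letters (as in "it's") as part of the same word.
    return re.sub(
        r"[A-Za-z]+('[A-Za-z]+)?",
        lambda mo: mo.group(0).capitalize(),
        s,
    )

sample = "it's a beautiful day"
print(sample.title())                  # It'S A Beautiful Day
print(title_keep_apostrophes(sample))  # It's A Beautiful Day
```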
Modern Terms With Unconventional Capitalization
text3 = "iphone and e-mail tips for pdfs"
print(f"\ntitle(): {text3.title()}")
print(f"titlecase(): {titlecase(text3)}")
My use case would be to title-case voice-dictated text. There's something tricky here, because e-mail is one of those terms where the hyphenation is debatable and undergoing change. Personally, I prefer email without the hyphen. It's interesting how I voice-dictated this paragraph (with Wispr Flow) and it ended up both ways!
My preference for a return value here is iPhone and Email Tips for PDFs. In situations where a hyphenated word is optionally unhyphenated, I'd like the title-casing function to unhyphenate and then title-case it. If that's not possible, my backup preference is iPhone and E-Mail Tips for PDFs.
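As a rough illustration of that preference, here is a minimal rule-based sketch. The PREFERRED and SMALL_WORDS tables and the tc_preferred name are hypothetical, invented for this example, and are not part of the titlecase package. A real solution would need far larger tables, which is part of why an LLM is tempting here.

```python
# Hypothetical lookup tables for illustration only.
PREFERRED = {"iphone": "iPhone", "e-mail": "Email", "email": "Email", "pdfs": "PDFs"}
SMALL_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}

def tc_preferred(s):
    words = s.split()
    out = []
    for i, word in enumerate(words):
        key = word.lower()
        if key in PREFERRED:
            # Known term: use the preferred spelling, unhyphenated if listed that way
            out.append(PREFERRED[key])
        elif key in SMALL_WORDS and 0 < i < len(words) - 1:
            # Small words stay lowercase unless they are first or last
            out.append(key)
        else:
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)

print(tc_preferred("iphone and e-mail tips for pdfs"))
# iPhone and Email Tips for PDFs
```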
Using a Hosted LLM as a Title-Caser
def tc_gemini(s):
    # Ask Gemini 1.5 Flash for the title-cased string and nothing else
    model = genai.GenerativeModel('gemini-1.5-flash-latest')
    resp = model.generate_content(
        f"Convert '{s}' to title case, please. Return ONLY the title-cased string.",
        safety_settings=[],
        request_options={"timeout": 1000},
    )
    # resp.text raises if the response has no usable text; let that propagate
    return resp.text
tc_gemini(text3)
Gemini 1.5 Flash works decently as a title-caser with this simple prompt. I noticed, though, that the "mail" in "e-mail" isn't capitalized. That is one that people find confusing. The rule is that when a word is hyphenated, each part of the hyphenated word should be capitalized.
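The hyphenation rule as stated is simple enough to express directly. This is a naive sketch with a made-up function name, capitalize_hyphenated. It applies the rule to every part, so it would also capitalize small words inside longer compounds, which some style guides do not want.

```python
def capitalize_hyphenated(word):
    # Capitalize the first letter of every hyphen-separated part,
    # leaving the remaining letters untouched.
    return "-".join(part[:1].upper() + part[1:] for part in word.split("-"))

print(capitalize_hyphenated("e-mail"))  # E-Mail
```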
This feels a bit wasteful, though: it takes a lot of API calls to a service that will likely cost money in the future. I suppose you'd want to batch them if you went this way. I think it would be a lot nicer to use a small local LLM for simple tasks like this.
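One way batching could look, as a sketch rather than a tested recipe: put several titles in one numbered prompt, then split the numbered reply back into a list. The helper names build_batch_prompt and parse_batch_reply are hypothetical, and the parser assumes the model echoes the "1. ..." numbering, which is not guaranteed.

```python
def build_batch_prompt(titles):
    # Number each title so the reply can be matched back to its input.
    numbered = "\n".join(f"{i}. {t}" for i, t in enumerate(titles, start=1))
    return (
        "Convert each numbered line below to title case. "
        "Return ONLY the numbered, title-cased lines.\n"
        + numbered
    )

def parse_batch_reply(reply, expected_count):
    # Keep only lines shaped like "<number>. <text>" and drop the numbering.
    results = []
    for line in reply.splitlines():
        number, sep, rest = line.strip().partition(". ")
        if sep and number.isdigit():
            results.append(rest.strip())
    if len(results) != expected_count:
        raise ValueError(f"expected {expected_count} titles, got {len(results)}")
    return results
```

The prompt string could then be passed to model.generate_content in place of the single-title prompt, with parse_batch_reply applied to the returned text.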
Use Small Local LLMs as Title-Casers
def tc_ollama(s, model='mistral'):
    # Call ollama with a simple title-case prompt
    response = ollama.chat(model=model, messages=[{
        'role': 'user',
        'content': f"Convert '{s}' to title case. Return ONLY the title-cased string with no explanation or quotes."
    }])
    return response['message']['content'].strip()
print(tc_ollama(text3))
Mistral is quite good. Let's try others:
# Let's try a few different models to compare
models = ['llama3.2', 'tinyllama', 'deepseek-r1:7b', 'deepseek-coder:33b', 'qwen2.5:3b']
print("\nComparing models:")
for model in models:
    try:
        print(f"{model:18}: {tc_ollama(text3, model)}")
    except Exception as ex:
        # e.g. the model has not been pulled or the Ollama server is not running
        print(f"{model:18}: Failed ({ex})")
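Comparing models by eye gets tedious, so a small checker can help. The sketch below normalizes a raw reply and tests it against the two return values named earlier as acceptable for text3. The names ACCEPTED, clean_output, and is_acceptable are hypothetical. The `<think>` stripping assumes a reasoning model such as deepseek-r1 may put its reasoning in `<think>` tags before the answer, which can vary by model and Ollama version.

```python
import re

# The preferred and backup return values for text3.
ACCEPTED = {
    "iPhone and Email Tips for PDFs",
    "iPhone and E-Mail Tips for PDFs",
}

def clean_output(raw):
    # Drop any <think>...</think> block, then surrounding whitespace and quotes.
    without_thinking = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    return without_thinking.strip().strip("'\"").strip()

def is_acceptable(raw):
    return clean_output(raw) in ACCEPTED

print(is_acceptable('"iPhone and E-Mail Tips for PDFs"'))  # True
print(is_acceptable("Iphone And E-Mail Tips For Pdfs"))    # False
```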
© 2024-2025 Audrey M. Roy Greenfeld