In a recent video I showed off an early attempt at live machine translation of Pawapuro:
Here’s the method I used to make that video. Keep your expectations in check: this isn’t a brand new all-in-one application (although I’m tempted to try building one now that I know how it all works), and it’s a bit of a pain to set up, but it has worked better than any other tool I’ve tried for the use case of Pawapuro’s visual novel-like Success Modes.
You have to embrace some jank to have five programs open to do one thing. You have to embrace some jank when you’re using OCR (Optical Character Recognition) for Japanese, a language with thousands of characters. You have to embrace some jank when you’re machine translating. That’s why I call this the PawaJank Translation Workflow.
Necessary items:
- Python 3.8, 3.9, 3.10, or 3.11
- Manga_ocr
- ShareX
- AutoHotKey
- This AutoHotKey script
- A machine translator like DeepL or Google Translate. My preference has been DeepL based on testing results so far, the option to use a Windows app, and the free glossary feature.
How to Install
Install Python. Make sure you check the box that says “Add Python 3.[x] to PATH” in the installer.
Install manga_ocr. If you followed the previous step, this should be as simple as:
1. Open your Command Prompt (Press Windows key, type in “cmd” then hit Enter)
2. Type in “pip3 install manga-ocr”
3. Chill for five minutes while it downloads
But running Python can be irritating on Windows sometimes. If typing in “python” on your Command Prompt returns “‘python’ is not recognized as an internal or external command, operable program or batch file.” then Python isn’t installed or isn’t in PATH correctly. Try following the steps here. If “python” works fine on CMD but “pip3” does not, then you may need to separately install pip3.
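If you’d rather not guess which piece is missing, a small script can report what your Command Prompt can actually see. This is only a sketch built on Python’s standard `shutil.which`; the command names come from the steps above, the version range is the one in the Necessary items list, and the function names are my own. If `python` itself isn’t recognized, the `py` launcher that the python.org installer usually adds may still be able to run it.

```python
import shutil
import sys

# Command names taken from the install steps above.
COMMANDS = ("python", "pip3", "manga_ocr")


def find_tools(names=COMMANDS):
    """Map each command name to its full path on PATH, or None if it isn't found."""
    return {name: shutil.which(name) for name in names}


def version_supported(version=sys.version_info):
    """True if the given Python version falls in the 3.8 to 3.11 range."""
    return (3, 8) <= tuple(version[:2]) <= (3, 11)


def report(found):
    """Turn the lookup result into one readable line per command."""
    lines = []
    for name, path in found.items():
        status = path if path else "NOT FOUND on PATH"
        lines.append(f"{name}: {status}")
    return lines


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    note = "supported" if version_supported() else "outside 3.8 to 3.11"
    print(f"Running under Python {major}.{minor} ({note})")
    for line in report(find_tools()):
        print(line)
```

Anything reported as NOT FOUND is the piece to fix before moving on.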
If that’s all working, the hard part is over.
Install ShareX. Open it up and under Hotkey Settings add options for Capture Region using F4 and Capture Last Region using F6:
By default, ShareX will make a little camera noise each time it takes a new capture. You can turn this off under Task Settings in the Notifications page:
Install AutoHotKey.
Download this AutoHotKey script.
How to Use
Run manga_ocr and leave it open. Open your Command Prompt (Press Windows key, type in “cmd” then hit Enter). Type in “manga_ocr” without the quotes. Hit Enter. If it looks like the following picture, you’re golden. Leave that Command Prompt window open.
Open ShareX and leave it open.
Launch the AutoHotKey script. You should be able to just double-click the file and it will get started silently.
Open up your favorite machine translation tool and set it to Japanese -> English. DeepL has a
Open up your game or stream of choice. I use real hardware fed into a capture card and OBS, but probably the easier (and slightly less delayed) thing will be an emulator like RetroArch, Dolphin, PCSX2, whatever. The game window can be displayed at any reasonable size where the Japanese text is legible.
Press F4 then select the area of the screen you want to translate. For best results, you generally want to capture just inside the border of the main textbox on-screen:
Or if you just want to grab a certain word from somewhere else, capture the smallest area around the word you can (aiming for the name of that value 4 “defensive coordination” card in the bottom-middle):
Both examples above successfully captured the text perfectly. But capturing the full “defensive coordination” card with the picture of Pawapuro-kun included returned junk. And capturing the full screen also returned junk.
ShareX will take a pic of that region and put it on your clipboard. Manga_ocr will attempt to read the Japanese characters and replace your clipboard with those characters. You can now paste as normal (CTRL+V) in the Japanese textbox in your machine translator to try out the results.
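To make that hand-off a little less magical, here is a rough Python sketch of a single image-to-text step. It is not how the `manga_ocr` command is implemented internally; it only mirrors the behavior described above. The helper names are hypothetical, and the real wiring assumes the `manga-ocr`, Pillow, and `pyperclip` packages are installed.

```python
def ocr_clipboard_once(grab_image, read_text, set_clipboard):
    """Run one image -> OCR -> clipboard step.

    grab_image:    returns the image currently on the clipboard, or None
    read_text:     turns an image into a string of Japanese text
    set_clipboard: puts a string on the clipboard
    Returns the recognized text, or None if there was no image to read.
    """
    image = grab_image()
    if image is None:
        return None
    text = read_text(image)
    set_clipboard(text)
    return text


def run_with_real_tools():
    """Wire the step to real libraries (assumes manga-ocr, Pillow, pyperclip)."""
    from manga_ocr import MangaOcr
    from PIL import Image, ImageGrab
    import pyperclip

    mocr = MangaOcr()  # loads the OCR model, which takes a while

    def grab():
        # grabclipboard() can return an image, a list of file names, or None.
        clip = ImageGrab.grabclipboard()
        return clip if isinstance(clip, Image.Image) else None

    return ocr_clipboard_once(grab, mocr, pyperclip.copy)
```

Calling `run_with_real_tools()` right after an F4 capture should leave the recognized Japanese text on your clipboard, ready to paste.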
Once you’re ready to speed up the process, click into your machine translator’s Japanese text box and press F5 (recapture last region) or F7 (start auto-capturing). F5 will automatically grab a new picture of the last region you used, grab the Japanese text, blank out your translation textbox, and paste in the new text, generating a new translation. F7 does the same thing on a regular schedule, every 4 seconds by default. F8 will stop the auto-capture. Either option is good for when new text is in the same textbox and you just need to take a new picture of the last region you used.
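For the curious, the F5 and F7 behavior boils down to one repeatable step plus a timer. The sketch below shows that logic in Python rather than AutoHotKey, so it is an illustration of the idea and not the contents of the actual script. Every function name here is hypothetical, and the four actions are passed in as plain callables.

```python
import time


def recapture_and_paste(recapture, read_text, clear_textbox, paste):
    """One F5-style step: new picture, new text, fresh textbox, paste."""
    recapture()           # take a new picture of the last region
    text = read_text()    # pick up the Japanese text from the OCR
    clear_textbox()       # blank out the translator's input box
    paste(text)           # paste, which kicks off a new translation
    return text


def auto_capture(capture_step, should_stop, interval=4.0, sleep=time.sleep):
    """F7-style loop: repeat capture_step every `interval` seconds.

    Runs until should_stop() returns True (the F8 equivalent) and
    returns how many captures were made.
    """
    count = 0
    while not should_stop():
        capture_step()
        count += 1
        sleep(interval)
    return count
```

The 4 second default matches the interval mentioned above. A shorter interval means fresher translations, but also more half-rendered text getting captured mid-animation.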
Press ESC at any time to exit the AutoHotKey script when you’re done playing and ready to get your function keys back.
Quick Usage Guide
If you’ve already set everything up and just need to remind yourself what the keys are:
F4 – Set new region to capture
F5 – Recapture your last region and paste the Japanese text
F7 – Start auto-capturing the same region on a time interval (4 sec. default)
F8 – Stop auto-capturing
ESC – Close out of the AutoHotKey script
Configuration
If you would like to use different hotkeys or change the auto-capture interval, these things are configurable in the AutoHotKey script. Open it up in Notepad or something similar, change the values, save, and then double-click the script to launch it. If you didn’t save over the previous file, make sure you end the previous script in AutoHotKey before launching the new one or you’ll have two going at the same time.
This does require knowing Japanese at least a tiny bit, but if you’re using DeepL, you can add items to your glossary, which is manual work but really helpful over time. Machine translation can struggle with proper nouns, especially things like player names and team names. And the OCR might make the same mistake over and over, which can also be worked around by adding it to the glossary.
Various caveats and notes
- Unfortunately if manga_ocr sees little or no text, it will just make something up. Hopefully this changes in a future version. If you see some truly random but generic little sentence in your translator like “Each is fine, you know” or “Using this website,” probably manga_ocr didn’t actually detect any Japanese words.
- Maybe it just seems this way because katakana is the easiest character set for me, but manga_ocr seems to struggle a surprising amount with katakana. Mix-ups with the diacritics are pretty common. I’ve seen ベースボール (baseball) get read as ペースボール (paceball) and パワフルズ (Powerfuls) become バブフルズ (Bubblefuls?) in early testing. You can add these mistakes to your DeepL app glossary to work around it, but it’s a little frustrating.
- The OCR is better on some games than others for sure. Even a little gradient at the corner of the textbox can throw it off and make the results worse. Pawapuro 2009 definitely worked better than Pawapuro 10 in my early testing for example.
- Very specific use case note: If you use RetroArch through Steam, be aware that it requires window focus to take controller input. This translation method requires window focus to be on the translator, so we can’t have that. You can launch RetroArch outside of Steam to avoid this problem (right-click RetroArch in Steam > Browse local files > double-click RetroArch.exe).
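If you end up scripting any part of this yourself, the repeated katakana mix-ups can also be patched before the text ever reaches the translator. Here is a minimal sketch of that idea, using the two misreads from the katakana note above as the starting table. The function and table names are made up for illustration, and this is a plain find-and-replace, not a feature of manga_ocr or DeepL.

```python
# Known OCR misreads mapped to the intended text.
# Both pairs come from the katakana examples in the notes above.
OCR_FIXES = {
    "ペースボール": "ベースボール",  # "paceball" -> "baseball"
    "バブフルズ": "パワフルズ",      # misread -> "Powerfuls"
}


def fix_ocr_text(text, fixes=OCR_FIXES):
    """Replace every known misread in text with its correction."""
    for wrong, right in fixes.items():
        text = text.replace(wrong, right)
    return text
```

The catch is the same as with the glossary: you only find out about a misread after it has burned you once, so the table grows by hand over time.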
Other options I’ve tried
This isn’t the only Japanese OCR tool out there. Here’s a quick history of the other solutions I tried before I started doing it this way, with their positives and negatives.
Mobile app machine translation – I talked about options for this under Real-time Camera Translation here. The Google Translate app works pretty well just by opening it up and pointing it at your screen. It doesn’t refresh automatically, so you have to pull a Time Crisis and quickly point your phone away then back at the screen when new text appears. It’s awkward and crazy-looking but I still think setting a phone up on a tripod then covering the lens when you want to scan for new text would work decently. The DeepL app does not read text as well, though I think its translation quality is better when it reads the characters correctly.
ScreenTranslator – This was the start of my journey after seeing Emmdotfrisk link this GitHub. The OCR is done with Tesseract (which I use at my day job) which unfortunately isn’t up to the task (yet). If the text is pure black font like you typed Japanese into a word processor, it works great. If it’s anything else, like basically any Pawapuro story mode font with a basic little outline, it’s totally lost. I tried it with newer Tesseract trained data without any luck.
DeepL app – DeepL has a native Windows app with a screen capture tool. But I wonder if it might also be using Tesseract, because the OCR results are similarly terrible. I mean look at these results on capturing the very clear textbox at the bottom of the Pawapuro 11 main menu. Strange that their mobile app has usable OCR but the desktop app does not, even though the input is cleaner with a pure screen capture than pointing your phone at a monitor.
VGT – A program built on manga_ocr that links it to OpenAI’s translation capabilities. I don’t know how strong the OpenAI translation is, but theoretically the ability to set a custom translation prompt opens up some interesting possibilities, like “Translate this phrase from Japanese to English in a country bumpkin style” to get flavorful output. It does require selecting a new region with each capture, which isn’t very convenient for games with a static textbox. Can confirm the OCR works just as well through this tool, but I was unable to get the translation to produce anything but “Error” so far while using a valid OpenAI key.
Visual Novel OCR – Part of a whole set of translation tools called Sugoi Translator. This sucker is gigantic! Unzipping it alone is taking all day. This looks potentially powerful, but in half an hour of trying I could not quite wrap my head around how to actually translate something. Will keep looking into this!