A simple Docker container that prints the work-from-home percentage for each timesheet PDF in a folder.
Build the image:
docker build -t timesheet:latest .
Assuming you're on a Windows machine using PowerShell and are in the folder that contains your timesheet PDFs (on Linux or macOS, use $(pwd) instead of ${pwd}):
docker run -v ${pwd}:/data timesheet
If the output is not what you expect, run the individual steps by hand, one at a time, to see where it goes wrong.
You need pdftotext, which is part of the poppler utilities. On Debian or Ubuntu, install it with
apt-get install poppler-utils
and on macOS with
brew install poppler
Run the file input.pdf through pdftotext and write the result to output.txt:
pdftotext -layout input.pdf output.txt
Pipe output.txt through the Python script:
cat output.txt | python process_timesheet.py
Or skip the intermediate file: have pdftotext write to stdout (the - argument) and pipe that directly into the Python script:
pdftotext -layout input.pdf - | python process_timesheet.py
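To check a whole folder without Docker, the one-file pipeline above can be wrapped in a loop. This is only a sketch of what the container presumably does; the function name process_folder is made up for illustration, and it assumes process_timesheet.py is in the current directory and prints its result on a single line.

```shell
# Sketch: run the pdftotext | python pipeline on every PDF in a folder.
# process_folder is a hypothetical helper, not part of this repository.
process_folder() {
    dir="${1:-.}"   # folder to scan; defaults to the current directory
    for pdf in "$dir"/*.pdf; do
        # If the folder has no PDFs the pattern stays literal; skip it.
        [ -e "$pdf" ] || continue
        # Prefix each result with the file it came from.
        printf '%s: ' "$pdf"
        pdftotext -layout "$pdf" - | python process_timesheet.py
    done
}
```

After defining the function in your shell, call it with the folder that holds your timesheets, for example process_folder . for the current folder. A file whose output looks wrong can then be inspected with the single-file commands above.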