# fedora-localization-statistics

This project computes global localization statistics for the Fedora Linux operating system.

# Licensing

It is licensed under AGPL 3.0 for its content generation, and under MIT for the website layout.

It uses content from CLDR, licensed under the Unicode Inc. License Agreement.

It publishes content (compendiums, terminologies and translation memories) extracted from Fedora packages, which are upstream packages. Refer to the license of each Fedora package for the terms that apply to its content.

# Run it locally on Fedora

Update the `configuration.json` file:

```json
{
  "fedora_releases": ["list of values, from f7 for Fedora 7 to f40 for Fedora 40"],
  "staging_srpm_regex": "a regex to select a subset of fedora packages, useful for testing",
  "tm_for_versions": ["f40"]
}
```

Then create a virtual environment, install the dependencies, run the extraction and computation steps, and serve the website:

```bash
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
./run_all.py --env production --scope extract
./run_all.py --env production --scope compute
cd website
hugo server -D
```
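Since a run is long, it may be worth sanity-checking `configuration.json` before starting. The sketch below is not part of the project: `check_configuration` is a hypothetical helper, and it assumes only what this README states, namely that release identifiers look like `f7` to `f40` and that `staging_srpm_regex` is a regular expression.

```python
import json
import re

# Assumption: release identifiers are "f" followed by a number, starting at f7.
RELEASE_PATTERN = re.compile(r"f(\d+)")


def check_configuration(config: dict) -> list[str]:
    """Return a list of human-readable problems found in the configuration."""
    problems = []

    # Both fields hold release identifiers, so they get the same format check.
    for field in ("fedora_releases", "tm_for_versions"):
        for release in config.get(field, []):
            match = RELEASE_PATTERN.fullmatch(release)
            if match is None or int(match.group(1)) < 7:
                problems.append(f"{field}: unexpected release identifier {release!r}")

    # The regex should at least compile.
    try:
        re.compile(config.get("staging_srpm_regex", ""))
    except re.error as error:
        problems.append(f"staging_srpm_regex: invalid regex ({error})")

    return problems


def check_configuration_file(path: str = "configuration.json") -> list[str]:
    """Load the JSON file at `path` and check it."""
    with open(path, encoding="utf-8") as handle:
        return check_configuration(json.load(handle))
```

An empty list means nothing suspicious was found; calling `check_configuration_file()` from the repository root would check the real file.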

A full run takes 10 to 20 hours, so you may wish to reduce the number of packages to scan.
To do that, see the comments for the `staging_srpm_regex` field in `configuration.json`, which selects a subset of Fedora packages.
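To get a feel for what a given regex would select, it can be tried on a few package names first. This is a minimal sketch, not the project's own filtering code: the `select_packages` helper and the package names are hypothetical, and it assumes the regex is matched from the start of each name (`re.match`), which may differ from what `run_all.py` actually does.

```python
import re


def select_packages(srpm_names: list[str], pattern: str) -> list[str]:
    """Keep only the names that the regex matches from their first character."""
    regex = re.compile(pattern)
    return [name for name in srpm_names if regex.match(name)]


# Hypothetical SRPM names, for illustration only.
names = [
    "gnome-shell-46.0-1.fc40.src.rpm",
    "kernel-6.8.5-301.fc40.src.rpm",
    "gnome-maps-46.0-1.fc40.src.rpm",
]

# Prints the two names that start with "gnome-".
print(select_packages(names, r"gnome-"))
```

A narrow pattern like this keeps a test run short; an overly broad one (for example `.*`) selects every package and gives no speed-up.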

# Run it anywhere with docker

```bash
# build local Dockerfile image
podman build . --tag fedlocstats:latest
# run $script (look at check_dnf_files for examples)
# note: ${release} must be set beforehand to the Fedora release to process
script="./run_all.py --env production --scope extract"
podman run -it --rm -v ./:/src:z -v ./results:/src/results:z -e DNF_CONF=dnf_${release}.conf -e TMP_DIR=/src/results/f${release}/tmp fedlocstats:latest $script