Add optional argument so nltk files can be downloaded in a specific directory#289
Add optional argument so nltk files can be downloaded in a specific directory#289himperion92 wants to merge 1 commit intosloria:devfrom
Conversation
JiwaniZakir
left a comment
There was a problem hiding this comment.
The main() function in download_corpora.py now reads NLTK_DATA from the environment to set download_dir, but this means there's no way to pass a custom download directory via the CLI — if NLTK_DATA isn't set, download_dir will be None and NLTK falls back to its default paths, which is fine, but a --download-dir CLI argument would make the feature more directly usable and align with the PR's stated goal of allowing a specific directory to be passed. As written, the only way to exercise the new download_dir parameter in download_lite and download_all is through the environment variable, not through the command line.
Additionally, NLTK_DATA is already natively recognized by NLTK itself (it's checked internally during path resolution), so reading it here and passing it explicitly is somewhat redundant — NLTK will honor that env var on its own without this code change. It would be worth verifying whether the real use case is something NLTK doesn't already handle (e.g., a non-standard env var name or a path not in NLTK_DATA), and if so, documenting that intent clearly. If the goal is purely to support NLTK_DATA, the added code may not be necessary at all.
No description provided.