Skip to content

Add optional argument so nltk files can be downloaded in a specific directory#289

Open
himperion92 wants to merge 1 commit intosloria:devfrom
himperion92:dev
Open

Add optional argument so nltk files can be downloaded in a specific directory#289
himperion92 wants to merge 1 commit intosloria:devfrom
himperion92:dev

Conversation

@himperion92
Copy link
Copy Markdown

No description provided.

Copy link
Copy Markdown

@JiwaniZakir JiwaniZakir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main() function in download_corpora.py now reads NLTK_DATA from the environment to set download_dir, but this means there's no way to pass a custom download directory via the CLI — if NLTK_DATA isn't set, download_dir will be None and NLTK falls back to its default paths, which is fine, but a --download-dir CLI argument would make the feature more directly usable and align with the PR's stated goal of allowing a specific directory to be passed. As written, the only way to exercise the new download_dir parameter in download_lite and download_all is through the environment variable, not through the command line.

Additionally, NLTK_DATA is already natively recognized by NLTK itself (it's checked internally during path resolution), so reading it here and passing it explicitly is somewhat redundant — NLTK will honor that env var on its own without this code change. It would be worth verifying whether the real use case is something NLTK doesn't already handle (e.g., a non-standard env var name or a path not in NLTK_DATA), and if so, documenting that intent clearly. If the goal is purely to support NLTK_DATA, the added code may not be necessary at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants