Feat: Add label filter support in pgdiskann client #724
XuanYang-cn merged 7 commits into zilliztech:main.
/assign @XuanYang-cn

@XuanYang-cn Could you please review this PR?

@alwayslove2013 Hi, this PR has been pending review for a while. Could you please take a look?

Hi @EeshaaKhan, thanks for the contribution, and apologies for the delayed response! I'll review this PR shortly. Before I do, could you please rebase onto the latest main? We've since upgraded to Pydantic v2 and updated several checks, so this PR didn't trigger the GitHub Actions workflow and may fail against the current codebase.
Branch force-pushed from 4a70c07 to 2e4459c.
Hi @XuanYang-cn,
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: EeshaaKhan, XuanYang-cn.
XuanYang-cn left a comment:
Thanks for the PR! The overall approach is clean and consistent with how pgvector and cockroachdb handle label filtering.
Worth noting:
- `ALTER TABLE ... SET STORAGE PLAIN` runs unconditionally in `_create_table`, regardless of `with_scalar_labels`. Disabling TOAST compression affects storage and performance for every PgDiskANN benchmark, so it's worth a brief comment in the code explaining why it's needed, and a mention in the PR description.
- `self._scalar_label_field = "label"` (singular) while `LabelFilter.label_field` defaults to `"labels"` (plural). A short comment on the column name choice would prevent future confusion.
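For illustration, both points could be addressed with comments along these lines. This is only a rough sketch, not the PR's actual code: `build_create_table_statements`, `SCALAR_LABEL_COLUMN`, the column types, and the `embedding` column name are hypothetical, and it assumes the pgvector `vector` type is available.

```python
# Hypothetical sketch: helper and constant names are illustrative, not taken from the PR.

# Column name in the benchmark table. Singular on purpose; this is the physical
# column, which is separate from LabelFilter.label_field (defaults to "labels").
SCALAR_LABEL_COLUMN = "label"


def build_create_table_statements(table: str, dim: int, with_scalar_labels: bool) -> list[str]:
    """Return the DDL statements for the benchmark table, in execution order."""
    columns = ["id BIGINT PRIMARY KEY", f"embedding vector({dim})"]
    if with_scalar_labels:
        # Assumed type; the real column type may differ.
        columns.append(f"{SCALAR_LABEL_COLUMN} VARCHAR(64)")
    return [
        f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(columns)});",
        # STORAGE PLAIN keeps the vector inline and uncompressed (no TOAST).
        # Note: this runs for every table, with or without scalar labels,
        # so it changes storage behaviour for all benchmark runs.
        f"ALTER TABLE {table} ALTER COLUMN embedding SET STORAGE PLAIN;",
    ]
```

Returning the statements as strings keeps the sketch runnable without a database; a real client would pass them to its cursor.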
@XuanYang-cn Thanks for the feedback! I've addressed both points:
…zation" This reverts commit d10b296.
This PR adds support for label-based filtering in pgdiskann client.
Changes introduced:
SET STORAGE PLAIN) to disable TOAST compression on the embedding column for improved query performance
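As a rough illustration of what label-based filtering means at query time, a filtered search might be assembled as below. This is a minimal sketch under assumptions, not the client's actual implementation: `build_search_query`, the `label` and `embedding` column names, and the use of the pgvector cosine-distance operator `<=>` are all hypothetical choices for the example.

```python
from typing import Optional


def build_search_query(
    table: str,
    query_vector: list[float],
    k: int,
    label_value: Optional[str] = None,
) -> tuple[str, list]:
    """Build a parameterized top-k query, optionally restricted to one label.

    Hypothetical helper: column names and the distance operator are assumptions.
    """
    params: list = []
    where = ""
    if label_value is not None:
        # Filter on the scalar label column before ordering by distance.
        where = " WHERE label = %s"
        params.append(label_value)
    # pgvector accepts vectors in text form, e.g. "[0.1,0.2]".
    vector_text = "[" + ",".join(str(x) for x in query_vector) + "]"
    params.extend([vector_text, k])
    sql = f"SELECT id FROM {table}{where} ORDER BY embedding <=> %s::vector LIMIT %s"
    return sql, params
```

The placeholders follow the `%s` style used by psycopg, so the returned pair could be passed to `cursor.execute(sql, params)`.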