-
Notifications
You must be signed in to change notification settings - Fork 655
Dynamic mode operator docstrings #6078
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| return function | ||
|
|
||
|
|
||
| def build_fn_wrappers(all_ops): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved here from the (now removed) _fn.py
|
CI MESSAGE: [37593417]: BUILD STARTED |
| global _all_ops | ||
| _all_ops = _op_builder.build_operators() | ||
| global _all_ops, _all_functions | ||
| _all_ops, _all_functions = _op_builder.build_operators() |
Check notice
Code scanning / CodeQL
Unused global variable Note
| from nvidia.dali.ops import _registry, _names | ||
|
|
||
|
|
||
| _MAX_INPUT_SPELLED_OUT = 5 |
Check notice
Code scanning / CodeQL
Unused global variable Note
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR refactors documentation generation and naming for DALI's dynamic mode API. Key changes include:
- Single source of truth for names: Unified naming logic across different API modes (ops, fn, dynamic) in
_names.pyand_docs.py - Dynamic API docstrings: Added comprehensive docstring generation for dynamic mode operators and functions, with automatic replacement of "TensorList" references with "Tensor/Batch"
- Improved input naming: Refactored
_get_input_name()to handle name collisions with operator arguments and removed the__prefix for cleaner signatures - Consolidated function wrappers: Moved function wrapper building from separate
_fn.pyinto_op_builder.pywith proper docstring generation - Argument filtering: Added support for filtering which arguments appear in documentation based on what's actually used in dynamic mode
- New tests: Added comprehensive tests to verify documentation is present and correctly formatted for dynamic API
The refactoring improves code maintainability by reducing duplication and ensures consistent documentation across all API modes.
Confidence Score: 4/5
- This PR is safe to merge with minimal risk, though thorough testing of documentation generation is recommended
- The PR is a well-structured refactoring focused on documentation and naming. The changes are mostly additive (adding dynamic API support) with some consolidation (removing
_fn.py). The logic changes in_get_input_name()for collision detection are sound, and comprehensive tests were added. Score is 4/5 rather than 5/5 due to the complexity of the signature/docstring generation code and the removal of the__prefix from input names which could potentially affect downstream tooling, though the changes appear backward compatible for the public APIs. - Pay closer attention to
dali/python/nvidia/dali/ops/_names.pyfor the input name collision logic, and ensure the documentation tests pass in CI
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| dali/python/nvidia/dali/types.py | 5/5 | Added api parameter to _type_name_convert_to_string to replace "TensorList" with "Tensor/Batch" for dynamic API |
| dali/python/nvidia/dali/ops/_names.py | 4/5 | Refactored input name generation with collision detection; added dynamic API support; removed __ prefix from input names |
| dali/python/nvidia/dali/ops/_docs.py | 5/5 | Added dynamic API support with TensorList→Tensor/Batch replacement; added argument filtering; refactored docstring generators |
| dali/python/nvidia/dali/experimental/dynamic/_op_builder.py | 4/5 | Major refactor: added docstring generation, consolidated function wrapper building, improved input handling with _get_inputs() |
Sequence Diagram
sequenceDiagram
participant User
participant OpBuilder as _op_builder
participant Docs as _docs
participant Names as _names
participant Types as types
participant Schema as OpSchema
User->>OpBuilder: build_operators()
OpBuilder->>OpBuilder: build_constructor(schema, op_class)
OpBuilder->>Schema: GetArgumentNames()
Schema-->>OpBuilder: argument list
OpBuilder->>Docs: _docstring_generator_class(schema_name, api="dynamic", args)
Docs->>Schema: get operator info
Docs->>Docs: _get_inputs_doc(schema, api="dynamic")
Docs->>Names: _get_input_name(schema, input_idx)
Names->>Schema: HasInputDox(), GetInputName()
Names->>Schema: HasArgument(name) [collision check]
Names-->>Docs: input name
Docs->>Docs: _adjust_dox(dox, api="dynamic")
Note over Docs: Replace "TensorList" with "Tensor/Batch"
Docs->>Docs: _get_kwargs(schema, api="dynamic", args)
Docs->>Types: _type_name_convert_to_string(dtype, allow_tensors, api="dynamic")
Types-->>Docs: type string with "Tensor/Batch"
Docs-->>OpBuilder: complete docstring
OpBuilder->>OpBuilder: makefun.create_function(header, init, doc)
OpBuilder->>OpBuilder: build_call_function(schema, op_class)
OpBuilder->>Docs: _docstring_generator_call(schema_name, api="dynamic", args)
Docs->>Docs: Similar flow as above
Docs-->>OpBuilder: call docstring
OpBuilder->>OpBuilder: makefun.create_function(header, call, doc)
OpBuilder->>OpBuilder: build_fn_wrappers(all_ops)
loop for each operator
OpBuilder->>OpBuilder: build_fn_wrapper(op)
OpBuilder->>Names: _get_inputs(schema)
Names->>Names: _get_input_name() with collision detection
Names-->>OpBuilder: input parameters
OpBuilder->>Docs: _docstring_generator_fn(schema_name, api="dynamic", args)
Docs-->>OpBuilder: function docstring
OpBuilder->>OpBuilder: makefun.create_function(header, fn_call, doc)
end
OpBuilder-->>User: all_op_classes, all_fn_wrappers
11 files reviewed, no comments
|
CI MESSAGE: [37593417]: BUILD FAILED |
|
|
|
|
|
|
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michał Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
e00718f to
d617549
Compare
Those are quite valid points, but the fix is just to write the documentation. This PR is about checking the auto-generated docstrings for operators. |
Fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR refactors operator documentation generation to support the dynamic mode API with proper docstrings.
Key changes:
- Unified name and signature generation across ops, fn, and dynamic APIs using a single source of truth
- Added
api="dynamic"parameter throughout the documentation stack to replace TensorList references with Tensor/Batch - Improved input name collision detection in
_get_input_name()to prevent conflicts with operator arguments - Consolidated function wrapper creation into
build_operators()instead of separate initialization - Added
schema.Name()method binding in C++ backend for consistent schema name access
Implementation highlights:
- The
_get_inputs()helper function standardizes positional input parameter generation across constructors and function wrappers - Documentation generators now filter arguments based on
used_kwargsto show only relevant parameters - Input names use collision avoidance logic (prepending
__or appending_input) when conflicts with argument names occur - The
_MAX_INPUT_SPELLED_OUTconstant (5) controls when variadic*inputsis used vs spelling out individual parameters
Testing:
Tests verify that dynamic mode documentation excludes TensorList and graph references, ensuring proper API-specific terminology
Confidence Score: 5/5
- This PR is safe to merge with no critical issues found
- The refactoring is well-structured with consistent patterns across the codebase. All changes are additive/refactoring with no breaking modifications to existing APIs. The code properly handles edge cases (zero inputs, collision avoidance, optional parameters) and includes comprehensive tests to validate the documentation generation
- No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| dali/python/nvidia/dali/ops/_names.py | 5/5 | Refactored input name generation with collision detection and added support for dynamic API naming conventions |
| dali/python/nvidia/dali/ops/_docs.py | 5/5 | Extended documentation generators to support dynamic API with Tensor/Batch types and argument filtering |
| dali/python/nvidia/dali/experimental/dynamic/_op_builder.py | 5/5 | Integrated docstring generation, refactored signature building with _get_inputs(), and consolidated function wrapper creation |
| dali/test/python/experimental_mode/test_docs.py | 5/5 | Comprehensive test suite verifying dynamic mode documentation correctness, checking for TensorList/graph references and doc presence |
Sequence Diagram
sequenceDiagram
participant User
participant DynamicOps as dynamic.ops
participant OpBuilder as _op_builder
participant Docs as _docs
participant Names as _names
participant Backend as backend_impl
User->>DynamicOps: Import dynamic module
DynamicOps->>OpBuilder: _initialize() -> build_operators()
loop For each registered operator
OpBuilder->>Backend: GetSchema(schema_name)
Backend-->>OpBuilder: OpSchema
OpBuilder->>OpBuilder: build_constructor(schema, op_class)
OpBuilder->>Names: _get_input_name(schema, i)
Names->>Backend: schema.GetInputName() / schema.HasArgument()
Names-->>OpBuilder: input_name (with collision avoidance)
OpBuilder->>Docs: _docstring_generator_class(schema.Name(), api="dynamic")
Docs->>Names: _get_input_name(schema, i)
Docs->>Docs: _adjust_dox(dox, api) - Replace TensorList with Tensor/Batch
Docs-->>OpBuilder: Constructor docstring
OpBuilder->>OpBuilder: build_call_function(schema, op_class)
OpBuilder->>Docs: _docstring_generator_call(schema.Name(), api="dynamic")
Docs-->>OpBuilder: __call__ docstring
OpBuilder->>OpBuilder: build_fn_wrapper(op)
OpBuilder->>OpBuilder: _get_inputs(schema)
OpBuilder->>Docs: _docstring_generator_fn(schema.Name(), api="dynamic")
Docs-->>OpBuilder: Function docstring
end
OpBuilder-->>DynamicOps: (all_op_classes, all_fn_wrappers)
DynamicOps-->>User: Operators with dynamic mode docstrings
11 files reviewed, no comments
|
CI MESSAGE: [37806636]: BUILD STARTED |
|
|
||
|
|
||
| def _docstring_prefix_from_inputs(op_name): | ||
| def _call_description(api): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear from the docstring what to expect when calling a reader. Can some generic description be filled here or should it go in the class docstring instead (out of the scope of this PR)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of __call__ on operators is going to be rather an exception than a pattern. I think we can work it out later, if necessary.
szkarpinski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a follow-up, please address the remaining comments. In particular, let's make sure that reader's documentation is helpful
|
CI MESSAGE: [37809283]: BUILD STARTED |
|
CI MESSAGE: [37809283]: BUILD PASSED |
* Add name generation for dynamic API. * Limit the number of spelled-out arguments. * Adjust argument type name for dynamic API (use Tensor/Batch instead of TensorList) * Add docstrings for dynamic mode ops and functions. * Fix __call__ documentation. Update tests to look for mention of graph. Signed-off-by: Michal Zientkiewicz <[email protected]>
Category:
New feature (non-breaking change which adds functionality)
Refactoring (Redesign of existing code that doesn't affect functionality)
Description:
This PR refactors the name and signature generation and uses (to an extent possible) a single source of truth for the names used in different modes. It also adds
"dynamic"API to the name-generating functions, adds argument filtering and replaces TensorList with Tensor/Batch for the dynamic API.Additional information:
Affected modules and functionalities:
Signatures, dynamic mode op_builder, documentation.
Key points relevant for the review:
Tests:
Existing test check for breakage. Documentation is yet another test (both that it builds and looks ok).
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: DALI-4464