Feature Request / Improvement
I noticed that in Python, the hive, glue and dynamo catalogs list all tables in the namespace, including non-Iceberg ones:
Hive catalog:

    def list_tables(self, namespace: Union[str, Identifier]) -> List[Identifier]:
        """List tables under the given namespace in the catalog (including non-Iceberg tables).

        When the database doesn't exist, it will just return an empty list.

        Args:
            namespace: Database to list.

        Returns:
            List[Identifier]: list of table identifiers.

        Raises:
            NoSuchNamespaceError: If a namespace with the given name does not exist, or the identifier is invalid.
        """
        database_name = self.identifier_to_database(namespace, NoSuchNamespaceError)
        with self._client as open_client:
            return [(database_name, table_name) for table_name in open_client.get_all_tables(db_name=database_name)]

Glue catalog:

    def list_tables(self, namespace: Union[str, Identifier]) -> List[Identifier]:
        """List tables under the given namespace in the catalog (including non-Iceberg tables).

        Args:
            namespace (str | Identifier): Namespace identifier to search.

        Returns:
            List[Identifier]: list of table identifiers.

        Raises:
            NoSuchNamespaceError: If a namespace with the given name does not exist, or the identifier is invalid.
        """
        database_name = self.identifier_to_database(namespace, NoSuchNamespaceError)
        table_list: List[TableTypeDef] = []
        next_token: Optional[str] = None
        try:
            while True:
                table_list_response = (
                    self.glue.get_tables(DatabaseName=database_name)
                    if not next_token
                    else self.glue.get_tables(DatabaseName=database_name, NextToken=next_token)
                )
                table_list.extend(table_list_response["TableList"])
                next_token = table_list_response.get("NextToken")
                if not next_token:
                    break

        except self.glue.exceptions.EntityNotFoundException as e:
            raise NoSuchNamespaceError(f"Database does not exist: {database_name}") from e
        return [(database_name, table["Name"]) for table in table_list]

However, in Java, we apply a filter so that only Iceberg tables in the given namespace are returned:
GlueCatalog.listTables
HiveCatalog.listTables
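
For illustration, a comparable filter on the Python side could look roughly like the sketch below. It assumes that each Glue table entry is a dict whose `Parameters` map marks Iceberg tables with a `table_type` value of `ICEBERG` (compared case-insensitively); the helper names `is_iceberg_table` and `filter_iceberg_tables` are hypothetical and not part of pyiceberg.

```python
from typing import Any, Dict, List, Tuple

# Assumption: Iceberg tables carry this marker in their table parameters.
TABLE_TYPE = "table_type"
ICEBERG = "iceberg"


def is_iceberg_table(table: Dict[str, Any]) -> bool:
    """Return True if the table's parameters mark it as an Iceberg table."""
    parameters = table.get("Parameters") or {}
    return str(parameters.get(TABLE_TYPE, "")).lower() == ICEBERG


def filter_iceberg_tables(
    database_name: str, table_list: List[Dict[str, Any]]
) -> List[Tuple[str, str]]:
    """Build identifiers only for the entries that look like Iceberg tables."""
    return [(database_name, table["Name"]) for table in table_list if is_iceberg_table(table)]


if __name__ == "__main__":
    # Hand-written sample entries shaped like Glue "TableList" items.
    sample = [
        {"Name": "events", "Parameters": {"table_type": "ICEBERG"}},
        {"Name": "raw_logs", "Parameters": {"classification": "parquet"}},
        {"Name": "no_params"},
    ]
    print(filter_iceberg_tables("analytics", sample))
```

In the Glue snippet above, this would amount to replacing the final list comprehension with a filtered one. The Hive case would likely need an extra metastore call, since `get_all_tables` returns only table names.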
I don't remember whether we discussed this before: why did we choose to include non-Iceberg tables in the result in Python?
cc @Fokko
Snippet sources:
iceberg-python/pyiceberg/catalog/hive.py, lines 488 to 504 in acc934f
iceberg-python/pyiceberg/catalog/glue.py, lines 584 to 613 in acc934f