
[Performance] Data viewer can't handle large DFs #3434

@FranciscoRZ


Environment data

  • VS Code version: 1.33.1
  • Extension version (available under the Extensions sidebar): 2019.4.1
  • OS and version: Windows 7
  • Python version (& distribution if applicable, e.g. Anaconda): Anaconda distribution, Python 3.6.2
  • Type of virtual environment used (N/A | venv | virtualenv | conda | ...): conda
  • Relevant/affected Python packages and their versions: None

Expected behaviour

View large DataFrames (>1000 columns, >1000 rows) in under 1 minute

Actual behaviour

When opening a large DataFrame (the current one is 709x3201), the Data Viewer renders the table structure but every cell stays at 'loading ...' (20 minutes so far and still running).

Steps to reproduce:

  1. Create synthetic data frame: 3000 series of 700 floats each
  2. In the variable explorer, click 'View in Data Viewer'
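
For step 1, a frame of roughly this size can be built as in the sketch below. This is not the original data: the random values, the seed, and the `series_<i>` column names are assumptions made for illustration, and the shape (700 rows by 3000 columns) is read off "3000 series of 700 floats each".

```python
import numpy as np
import pandas as pd

# Assumed shape: 700 rows by 3000 columns ("3000 series of 700 floats each").
N_ROWS, N_COLS = 700, 3000

# Fixed seed so the synthetic frame is the same on every run.
rng = np.random.RandomState(0)

df = pd.DataFrame(
    rng.standard_normal((N_ROWS, N_COLS)),
    # Hypothetical column names; the real frame's labels are not known.
    columns=[f"series_{i}" for i in range(N_COLS)],
)

print(df.shape)
```

Once `df` exists in the session, it should appear in the variable explorer and can be opened as described in step 2.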

Logs

Output for Python in the Output panel (View → Output, change the drop-down in the upper-right of the Output panel to Python)

None

Output from Console under the Developer Tools panel (toggle Developer Tools on under Help; turn on source maps to make any tracebacks useful by running Enable source map support for extension debugging)

Can't find any relevant logs. Is 'View in Data Viewer' supposed to show up in the logs at some point?

I was really looking forward to these features, so thanks for getting them in there! However, when dealing with quantitative finance problems we often have very large dataframes, and it would be nice to be able to use the data viewer to explore them.

Best regards,

Francisco
