Texera workflow

Collaborative Data Science

  • Access from anywhere
  • Data and workflow sharing
  • Simultaneous editing and execution
  • Follow collaborators' changes in real time
  • Workflow history and version control

Graphical Workflow Interface

  • Zero-code, drag-and-drop workflow construction
  • Live visualization of workflow status
  • Interact with the execution

Expandable AI/ML Access

  • Rich collection of out-of-the-box operators, including AI/ML
  • Integrated code editor for custom operators
  • Multi-language support: Python, R, Java/Scala

Motivation

  • Data science is labor-intensive and particularly challenging for non-IT users applying AI/ML.
  • Many workflow-based data science platforms lack parallelism, limiting their ability to handle big datasets.
  • Cloud services and technologies have advanced significantly over the past decade, enabling powerful browser-based interfaces supported by high-speed networks.
  • Existing data science platforms offer limited interaction during long-running jobs, making them difficult to manage after execution begins.

Goals

  • Provide data science as a cloud service
  • Provide a browser-based GUI for building workflows without writing code
  • Make data science accessible to non-IT users
  • Support collaborative data science
  • Allow users to interact with a job while it is executing
  • Handle large volumes of data efficiently

Acknowledgements

NIH NIDDK
NSF
YourKit