Getting started with iiDrak Data Platform
A modern data lakehouse solution: an open, unified data processing platform for data lake and data warehouse workloads.
Overview of the iiDrak Data Platform
iiDrak is a unified solution that combines the best features of data lakes and data warehouses, providing a single platform for all your data management, processing, and AI/ML needs. It is an on-premise-first solution with extensive support for cross-cloud integration, and it is both cloud-native and cloud-agnostic. Data and executors are fully decoupled, so you can bring your own storage (BYOS) and bring your own compute (BYOC).
iiDrak Data Platform supports three executor modes:
- Clustered executors - Execute queries on an Apache Spark cluster. Useful for querying large datasets.
- Single-node executor - Uses DuckDB on a single node to run fast queries against tables holding medium-sized data (e.g., 1GB - 100GB).
- Serverless executor - Uses the browser's WASM capabilities to run queries inside the browser against small datasets (e.g., < 10GB), such as exploratory queries against CSV files in object storage.
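As a rough illustration of how the three modes map to data size, the helper below suggests a mode for a given dataset size in GB. The function name `pick_executor` is hypothetical and not part of iiDrak. The cutoffs are an assumption: the example ranges above overlap between 1GB and 10GB, and this sketch resolves that overlap in favor of the serverless executor.

```shell
# pick_executor SIZE_GB
# Hypothetical helper (not an iiDrak command): suggests an executor mode
# from a dataset size given as a whole number of GB.
# Assumed cutoffs: under 10GB -> serverless, up to 100GB -> single-node,
# anything larger -> clustered.
pick_executor() {
    size_gb=$1
    if [ "$size_gb" -lt 10 ]; then
        echo "serverless"
    elif [ "$size_gb" -le 100 ]; then
        echo "single-node"
    else
        echo "clustered"
    fi
}
```

For example, `pick_executor 50` suggests the single-node executor. In practice the right choice also depends on query complexity and the memory available to the browser or node, so treat the cutoffs as a starting point.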
Beyond the decoupled storage and executor architecture, iiDrak Data Platform adds:
- Unified data management across structured and unstructured data
- Enterprise-grade security and governance
- Integrated AI/ML capabilities
- Visual tools for data pipeline creation and AI/ML Workflow creation
- Advanced query capabilities with AI assistance
Key Benefits
- Simplified Architecture: Eliminate data silos by managing all data types in one platform. iiDrak + Data Catalog can manage all the data sources within the enterprise.
- Cost Optimization: Separate storage and compute resources for optimal resource utilization
- Flexibility: Deploy anywhere - cloud, on-premise, or hybrid environments
- Enhanced Productivity: Visual tools and AI assistance accelerate development
- Enterprise Governance: Built-in security and compliance features
- Query Federation: Run queries directly against the source systems without moving the data.
System Requirements
Cloud Deployment
Supported Cloud Platforms:
- AWS (recommended: m5.xlarge or equivalent)
- Azure (recommended: Standard_D4s_v3 or equivalent)
- Google Cloud Platform (recommended: n2-standard-4 or equivalent)
Minimum Storage: 1TB for system storage
Network: High-bandwidth internet connection (minimum 1 Gbps)
On-Premise Deployment
CPU: 8+ cores (recommended: Intel Xeon or AMD EPYC)
RAM: 32GB minimum (recommended: 64GB)
Storage:
- System: 200GB SSD
- Data: Based on requirements (recommended: starts at 2TB)
Network: 10 Gbps network interface
Software Requirements
Operating System:
- Linux (Ubuntu 20.04 LTS or later)
- Red Hat Enterprise Linux 8.x or later
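Before installing on-premise, it can help to compare a host against the minimums listed above. The following is a minimal sketch, not an official iiDrak tool: the function names `meets_minimum` and `preflight` are hypothetical, the thresholds are copied from the on-premise requirements (8 cores, 32GB RAM, 200GB system storage), and it assumes a Linux host with `nproc`, `awk`, `df`, and `/proc/meminfo` available.

```shell
# Thresholds taken from the on-premise requirements above.
MIN_CORES=8
MIN_RAM_GB=32
MIN_DISK_GB=200

# meets_minimum LABEL ACTUAL REQUIRED
# Prints one result line and returns non-zero if ACTUAL is below REQUIRED.
meets_minimum() {
    if [ "$2" -ge "$3" ]; then
        echo "OK   $1: $2 (minimum $3)"
    else
        echo "FAIL $1: $2 (minimum $3)"
        return 1
    fi
}

# preflight
# Hypothetical check (not shipped with iiDrak): reads CPU, RAM, and free
# disk space from the local Linux host and compares them to the minimums.
preflight() {
    cores=$(nproc)
    # MemTotal is in kB; round up, because the kernel reserves some memory
    # and a nominal 32GB host usually reports slightly less.
    ram_gb=$(awk '/^MemTotal:/ {print int(($2 + 1048575) / 1048576)}' /proc/meminfo)
    # Available space on the root filesystem, in GB (-P: portable format, -k: 1K blocks).
    disk_gb=$(df -Pk / | awk 'NR == 2 {print int($4 / 1048576)}')

    status=0
    meets_minimum "CPU cores" "$cores" "$MIN_CORES" || status=1
    meets_minimum "RAM (GB)" "$ram_gb" "$MIN_RAM_GB" || status=1
    meets_minimum "Free disk on / (GB)" "$disk_gb" "$MIN_DISK_GB" || status=1
    return "$status"
}
```

After loading these functions into a shell, running `preflight` prints one line per check and returns a non-zero status if any minimum is not met. The sketch does not check the disk type, the network interface speed, or the operating system version, and it assumes the system storage lives on the root filesystem.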
Get Started - Installer
# Download the installer
curl -O https://nexaris.com/installer/iidrak_setup.sh
# Make the downloaded script executable, then run it
chmod +x iidrak_setup.sh
./iidrak_setup.sh