polars-fastdataframes
High-performance DataFrames with Polars
notebooks/01_basic_operations.ipynb
{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Basic Polars DataFrame Operations\n",
        "\n",
        "<!--\n",
        "Author: RSK World\n",
        "Website: https://rskworld.in\n",
        "Email: help@rskworld.in\n",
        "Phone: +91 93305 39277\n",
        "-->\n",
        "\n",
        "This notebook demonstrates fundamental operations with Polars DataFrames: creating DataFrames, selecting columns, filtering rows, adding columns, grouping and aggregating, sorting, joining, window functions, and string and date operations.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Author: RSK World\n",
        "# Website: https://rskworld.in\n",
        "# Email: help@rskworld.in\n",
        "# Phone: +91 93305 39277\n",
        "\n",
        "import polars as pl\n",
        "from datetime import datetime\n",
        "\n",
        "print(\"Polars version:\", pl.__version__)\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 1. Creating a DataFrame\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Create a sample DataFrame\n",
        "df = pl.DataFrame({\n",
        "    'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],\n",
        "    'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'Grace', 'Henry', 'Ivy', 'Jack'],\n",
        "    'age': [25, 30, 35, 28, 32, 27, 29, 31, 26, 33],\n",
        "    'salary': [50000, 60000, 70000, 55000, 65000, 52000, 58000, 62000, 51000, 68000],\n",
        "    'department': ['IT', 'HR', 'IT', 'Finance', 'IT', 'HR', 'Finance', 'IT', 'HR', 'Finance']\n",
        "})\n",
        "\n",
        "print(\"DataFrame shape:\", df.shape)\n",
        "print(\"\\nDataFrame:\")\n",
        "df\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 2. Selecting Columns\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Select specific columns\n",
        "selected = df.select(['name', 'age', 'salary'])\n",
        "selected\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 3. Filtering Rows\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Filter rows where age > 30\n",
        "filtered = df.filter(pl.col('age') > 30)\n",
        "print(\"Employees older than 30:\")\n",
        "filtered\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 4. Adding New Columns\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Add a bonus column (10% of salary) and a total compensation column (salary plus bonus)\n",
        "df_with_bonus = df.with_columns([\n",
        "    (pl.col('salary') * 0.1).alias('bonus'),\n",
        "    (pl.col('salary') * 1.1).alias('total_compensation')\n",
        "])\n",
        "df_with_bonus\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 5. Group By and Aggregate\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Group by department and calculate statistics\n",
        "grouped = df.group_by('department').agg([\n",
        "    pl.col('salary').mean().alias('avg_salary'),\n",
        "    pl.col('salary').max().alias('max_salary'),\n",
        "    pl.col('salary').min().alias('min_salary'),\n",
        "    pl.col('age').mean().alias('avg_age'),\n",
        "    pl.len().alias('employee_count')  # pl.len() replaces the deprecated no-argument pl.count()\n",
        "])\n",
        "grouped\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 6. Sorting\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Sort by salary in descending order\n",
        "sorted_df = df.sort('salary', descending=True)\n",
        "sorted_df\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 7. Join Operations\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Create a department details DataFrame\n",
        "dept_df = pl.DataFrame({\n",
        "    'department': ['IT', 'HR', 'Finance'],\n",
        "    'location': ['Building A', 'Building B', 'Building C'],\n",
        "    'budget': [500000, 300000, 400000]\n",
        "})\n",
        "\n",
        "# Left join\n",
        "joined = df.join(dept_df, on='department', how='left')\n",
        "joined\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 8. Window Functions\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Rank employees by salary within each department\n",
        "# (rank() defaults to ascending order with method='average', so rank 1 is the lowest salary)\n",
        "ranked = df.with_columns([\n",
        "    pl.col('salary').rank().over('department').alias('rank_in_dept'),\n",
        "    pl.col('salary').max().over('department').alias('dept_max_salary')\n",
        "])\n",
        "ranked\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 9. String Operations\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# String operations\n",
        "string_ops = df.with_columns([\n",
        "    pl.col('name').str.to_uppercase().alias('name_upper'),\n",
        "    pl.col('name').str.to_lowercase().alias('name_lower'),\n",
        "    pl.col('name').str.len_chars().alias('name_length')\n",
        "])\n",
        "string_ops\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 10. Date Operations\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Create a DataFrame with dates\n",
        "dates_df = pl.DataFrame({\n",
        "    'date': pl.date_range(datetime(2023, 1, 1), datetime(2023, 1, 10), '1d', eager=True),\n",
        "    'value': range(10)\n",
        "})\n",
        "\n",
        "# Extract date components (dt.weekday() follows ISO numbering: Monday = 1, Sunday = 7)\n",
        "dates_df = dates_df.with_columns([\n",
        "    pl.col('date').dt.year().alias('year'),\n",
        "    pl.col('date').dt.month().alias('month'),\n",
        "    pl.col('date').dt.day().alias('day'),\n",
        "    pl.col('date').dt.weekday().alias('weekday')\n",
        "])\n",
        "dates_df\n"
      ]
    }
  ],
  "metadata": {
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}