<!DOCTYPE html>
<html lang="en">
<head>
    <!--
        Language Translation Dataset - Demo Page
        Author: RSK World
        Website: https://rskworld.in
        Email: help@rskworld.in
        Phone: +91 93305 39277
        Copyright © 2016 RSK World. All rights reserved.
    -->
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Language Translation Dataset - RSK World</title>
    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css">
    <style>
        :root {
            --primary-color: #dc3545;
            --secondary-color: #6c757d;
        }
        body {
            font-family: 'Google Sans', 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
            background: #f8f9fa;
            min-height: 100vh;
            padding: 0;
            margin: 0;
        }
        .main-container {
            max-width: 1400px;
            margin: 0 auto;
            background: white;
            min-height: 100vh;
        }
        .header {
            background: white;
            border-bottom: 1px solid #e0e0e0;
            padding: 20px 40px;
            text-align: center;
        }
        .header h1 {
            margin: 0;
            font-size: 2rem;
            font-weight: 400;
            color: #1a73e8;
        }
        .header p {
            margin: 10px 0 0;
            color: #5f6368;
            font-size: 0.9rem;
        }
        .content {
            padding: 20px 40px 40px;
        }
        .feature-card {
            border: 1px solid #e0e0e0;
            border-radius: 10px;
            padding: 20px;
            margin-bottom: 20px;
            transition: transform 0.3s, box-shadow 0.3s;
        }
        .feature-card:hover {
            transform: translateY(-5px);
            box-shadow: 0 5px 20px rgba(0,0,0,0.1);
        }
        .feature-icon {
            font-size: 2rem;
            color: var(--primary-color);
            margin-bottom: 15px;
        }
        .dataset-preview {
            background: #f8f9fa;
            border-radius: 10px;
            padding: 20px;
            margin: 20px 0;
        }
        .table-responsive {
            border-radius: 8px;
            overflow: hidden;
        }
        .badge-custom {
            padding: 8px 15px;
            border-radius: 20px;
            font-weight: 500;
        }
        .footer {
            background: #2c3e50;
            color: white;
            padding: 30px;
            text-align: center;
        }
        .footer a {
            color: #3498db;
            text-decoration: none;
        }
        .btn-download {
            background: var(--primary-color);
            color: white;
            padding: 12px 30px;
            border-radius: 25px;
            border: none;
            font-weight: 600;
            transition: all 0.3s;
        }
        .btn-download:hover {
            background: #c82333;
            transform: scale(1.05);
        }
        /* Google Translate Style */
        .translate-container {
            background: white;
            border-radius: 8px;
            box-shadow: 0 2px 8px rgba(0,0,0,0.1);
            margin: 20px 0;
            overflow: hidden;
        }
        .translate-header {
            display: flex;
            align-items: center;
            justify-content: space-between;
            padding: 16px 20px;
            border-bottom: 1px solid #e0e0e0;
            background: #f8f9fa;
        }
        .language-selector {
            display: flex;
            align-items: center;
            gap: 12px;
            flex: 1;
        }
        .lang-select {
            flex: 1;
            padding: 10px 12px;
            border: 1px solid #dadce0;
            border-radius: 4px;
            font-size: 14px;
            background: white;
            cursor: pointer;
            transition: border-color 0.2s;
        }
        .lang-select:focus {
            outline: none;
            border-color: #1a73e8;
        }
        .swap-btn {
            background: transparent;
            border: 1px solid #dadce0;
            border-radius: 50%;
            width: 40px;
            height: 40px;
            display: flex;
            align-items: center;
            justify-content: center;
            cursor: pointer;
            transition: all 0.2s;
            color: #5f6368;
        }
        .swap-btn:hover {
            background: #f1f3f4;
            border-color: #1a73e8;
            color: #1a73e8;
        }
        .translate-body {
            display: grid;
            grid-template-columns: 1fr 1fr;
            gap: 0;
            border-top: 1px solid #e0e0e0;
        }
        .translate-box {
            position: relative;
            border-right: 1px solid #e0e0e0;
        }
        .translate-box:last-child {
            border-right: none;
        }
        .translate-box-header {
            display: flex;
            align-items: center;
            justify-content: space-between;
            padding: 12px 16px;
            background: #f8f9fa;
            border-bottom: 1px solid #e0e0e0;
        }
        .lang-label {
            font-size: 13px;
            color: #5f6368;
            font-weight: 500;
        }
        .box-actions {
            display: flex;
            gap: 8px;
        }
        .action-btn {
            background: transparent;
            border: none;
            padding: 6px;
            cursor: pointer;
            color: #5f6368;
            border-radius: 4px;
            transition: background 0.2s;
            display: flex;
            align-items: center;
            justify-content: center;
        }
        .action-btn:hover {
            background: #e8eaed;
        }
        .translate-textarea {
            width: 100%;
            min-height: 200px;
            border: none;
            padding: 16px;
            font-size: 16px;
            resize: none;
            font-family: inherit;
            line-height: 1.5;
            color: #202124;
        }
        .translate-textarea:focus {
            outline: none;
        }
        .translate-textarea::placeholder {
            color: #9aa0a6;
        }
        .translate-output {
            min-height: 200px;
            padding: 16px;
            font-size: 16px;
            line-height: 1.5;
            color: #202124;
            background: white;
        }
        .translate-footer {
            display: flex;
            align-items: center;
            justify-content: space-between;
            padding: 12px 16px;
            background: #f8f9fa;
            border-top: 1px solid #e0e0e0;
            font-size: 12px;
            color: #5f6368;
        }
        .char-count {
            color: #5f6368;
        }
        .detect-lang {
            color: #1a73e8;
            cursor: pointer;
            text-decoration: underline;
        }
        .detect-lang:hover {
            text-decoration: none;
        }
        @media (max-width: 768px) {
            .translate-body {
                grid-template-columns: 1fr;
            }
            .translate-box {
                border-right: none;
                border-bottom: 1px solid #e0e0e0;
            }
        }
    </style>
</head>
<body>
    <div class="main-container">
        <div class="header">
            <h1><i class="fas fa-file-alt"></i> Language Translation Dataset</h1>
            <p>Parallel corpus dataset with sentence pairs in multiple languages</p>
        </div>
        
        <div class="content">
            <div class="row mb-4">
                <div class="col-md-12">
                    <h2 class="mb-3">About This Dataset</h2>
                    <p class="lead">
                        This dataset contains parallel sentence pairs in multiple languages with aligned translations.
                        It is intended for machine translation, multilingual NLP, and cross-lingual model training.
                    </p>
                </div>
            </div>

            <div class="row mb-4">
                <div class="col-md-6">
                    <div class="feature-card">
                        <div class="feature-icon">
                            <i class="fas fa-language"></i>
                        </div>
                        <h4>Multiple Language Pairs</h4>
                        <p>Parallel sentences in multiple languages with aligned translations for comprehensive training.</p>
                    </div>
                </div>
                <div class="col-md-6">
                    <div class="feature-card">
                        <div class="feature-icon">
                            <i class="fas fa-align-center"></i>
                        </div>
                        <h4>Aligned Translations</h4>
                        <p>Precisely aligned sentence pairs ensuring accurate translation model training.</p>
                    </div>
                </div>
                <div class="col-md-6">
                    <div class="feature-card">
                        <div class="feature-icon">
                            <i class="fas fa-database"></i>
                        </div>
                        <h4>Training & Validation Sets</h4>
                        <p>Pre-split datasets ready for immediate use in machine learning pipelines.</p>
                    </div>
                </div>
                <div class="col-md-6">
                    <div class="feature-card">
                        <div class="feature-icon">
                            <i class="fas fa-robot"></i>
                        </div>
                        <h4>Ready for Translation Models</h4>
                        <p>Optimized format compatible with Transformers, mBERT, and mT5 models.</p>
                    </div>
                </div>
            </div>

            <div class="row mb-4">
                <div class="col-md-12">
                    <h3 class="mb-3">Technologies</h3>
                    <span class="badge badge-custom bg-primary me-2 mb-2">TSV</span>
                    <span class="badge badge-custom bg-success me-2 mb-2">JSON</span>
                    <span class="badge badge-custom bg-info me-2 mb-2">Transformers</span>
                    <span class="badge badge-custom bg-warning me-2 mb-2">mBERT</span>
                    <span class="badge badge-custom bg-danger me-2 mb-2">mT5</span>
                </div>
            </div>

            <!-- How It Works Section -->
            <div class="row mb-4">
                <div class="col-md-12">
                    <div class="feature-card" style="background: #f8f9fa;">
                        <h3 class="mb-4"><i class="fas fa-info-circle text-primary"></i> How It Works - Complete Guide</h3>
                        
                        <div class="accordion" id="howItWorksAccordion">
                            <!-- Translation Interface -->
                            <div class="accordion-item mb-3">
                                <h2 class="accordion-header">
                                    <button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse1">
                                        <i class="fas fa-language me-2"></i> Translation Interface - Step by Step Guide
                                    </button>
                                </h2>
                                <div id="collapse1" class="accordion-collapse collapse show" data-bs-parent="#howItWorksAccordion">
                                    <div class="accordion-body">
                                        <h5>Using the Translation Tool:</h5>
                                        <ol>
                                            <li><strong>Select Languages:</strong> Choose your source language (or use "Detect language" for auto-detection) and target language from the dropdown menus.</li>
                                            <li><strong>Enter Text:</strong> Type or paste any word, phrase, or sentence in the left text box.</li>
                                            <li><strong>Auto-Translation:</strong> The text is translated automatically as you type, after a 500 ms pause in typing.</li>
                                            <li><strong>View Result:</strong> The translated text appears in the right box.</li>
                                            <li><strong>Swap Languages:</strong> Click the swap button (↔) to reverse the translation direction.</li>
                                            <li><strong>Copy Text:</strong> Use the copy buttons to copy input or output text to clipboard.</li>
                                            <li><strong>Listen:</strong> Click the speaker icon to hear the translated text (text-to-speech).</li>
                                            <li><strong>Clear:</strong> Use the X button to clear the input field.</li>
                                        </ol>
                                        <p class="mt-3"><strong>Supported Languages:</strong> English, Spanish, French, German</p>
                                    </div>
                                </div>
                            </div>
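                            <!--
                            A minimal sketch of how the 500 ms auto-translate delay from step 3 could work,
                            assuming a plain debounce wrapper. This is not the page's own code: the name
                            "debounce" and the callback passed to it are hypothetical, and only the 500 ms
                            figure comes from the guide above.

```javascript
// Hypothetical helper: postpones calls to fn until waitMs milliseconds
// have passed without a new call, so only the latest input is handled.
function debounce(fn, waitMs) {
  let timerId = null;
  return function debounced(...args) {
    if (timerId !== null) {
      clearTimeout(timerId); // drop the call scheduled by the previous keystroke
    }
    timerId = setTimeout(() => {
      timerId = null;
      fn(...args);
    }, waitMs);
  };
}
```

                            In a browser this might be wired up as
                            textarea.addEventListener("input", debounce(onInput, 500)), where textarea and
                            onInput are placeholders for the page's real element and handler.
                            -->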

                            <!-- Translation System -->
                            <div class="accordion-item mb-3">
                                <h2 class="accordion-header">
                                    <button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapse2">
                                        <i class="fas fa-cogs me-2"></i> Translation System Architecture
                                    </button>
                                </h2>
                                <div id="collapse2" class="accordion-collapse collapse" data-bs-parent="#howItWorksAccordion">
                                    <div class="accordion-body">
                                        <h5>Three-Tier Translation System:</h5>
                                        <div class="row mt-3">
                                            <div class="col-md-4 mb-3">
                                                <div class="card h-100">
                                                    <div class="card-body">
                                                        <h6 class="card-title text-primary"><i class="fas fa-database"></i> Tier 1: Local Dictionary</h6>
                                                        <p class="card-text small">First, the system checks the local dictionary with <strong>1,983 translation entries</strong>. This works completely offline!</p>
                                                        <ul class="small">
                                                            <li>Instant translation</li>
                                                            <li>No internet required</li>
                                                            <li>Exact phrase matching</li>
                                                        </ul>
                                                    </div>
                                                </div>
                                            </div>
                                            <div class="col-md-4 mb-3">
                                                <div class="card h-100">
                                                    <div class="card-body">
                                                        <h6 class="card-title text-success"><i class="fas fa-puzzle-piece"></i> Tier 2: Word-by-Word</h6>
                                                        <p class="card-text small">If no exact match is found, the system translates the text word by word using the local dictionary.</p>
                                                        <ul class="small">
                                                            <li>Better coverage</li>
                                                            <li>Handles new combinations</li>
                                                            <li>Still works offline</li>
                                                        </ul>
                                                    </div>
                                                </div>
                                            </div>
                                            <div class="col-md-4 mb-3">
                                                <div class="card h-100">
                                                    <div class="card-body">
                                                        <h6 class="card-title text-warning"><i class="fas fa-cloud"></i> Tier 3: API Fallback</h6>
                                                        <p class="card-text small">As a last resort, the system calls the MyMemory Translation API for real-time translation.</p>
                                                        <ul class="small">
                                                            <li>Requires internet</li>
                                                            <li>Handles any text</li>
                                                            <li>Real-time translation</li>
                                                        </ul>
                                                    </div>
                                                </div>
                                            </div>
                                        </div>
                                        <p class="mt-3"><strong>Translation Status:</strong> The footer shows which method was used (Local Dictionary, Word-by-word, or API).</p>
                                    </div>
                                </div>
                            </div>
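                            <!--
                            The three tiers described above can be sketched as a simple fallback chain.
                            This is an illustration under stated assumptions, not the page's implementation:
                            the dictionary shape, the "en-es" key format, and all function names are
                            hypothetical, and tier 3 is a caller-supplied function rather than a real
                            MyMemory request. The sketch also assumes that tier 2 gives up when any single
                            word is missing from the dictionary.

```javascript
// Hypothetical dictionary shape: { "en-es": { "hello": "hola" } }.
const sampleDictionary = {
  "en-es": { "good morning": "buenos días", "hello": "hola", "friend": "amigo" },
};

// Tier 1: exact phrase match in the local dictionary.
function lookupExact(dictionary, pair, text) {
  const entries = dictionary[pair] || {};
  const key = text.trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(entries, key) ? entries[key] : null;
}

// Tier 2: translate each word separately; returns null if any word is unknown.
function lookupWordByWord(dictionary, pair, text) {
  const words = text.trim().toLowerCase().split(/\s+/);
  const translated = words.map((word) => lookupExact(dictionary, pair, word));
  return translated.includes(null) ? null : translated.join(" ");
}

// Tries each tier in order and reports which one produced the result.
// apiTranslate is an async function supplied by the caller (tier 3 stand-in).
async function translateWithFallback(dictionary, pair, text, apiTranslate) {
  const exact = lookupExact(dictionary, pair, text);
  if (exact !== null) return { text: exact, source: "Local Dictionary" };

  const wordByWord = lookupWordByWord(dictionary, pair, text);
  if (wordByWord !== null) return { text: wordByWord, source: "Word-by-word" };

  return { text: await apiTranslate(text, pair), source: "API" };
}
```

                            The returned source label mirrors the status text the footer is said to show
                            (Local Dictionary, Word-by-word, or API).
                            -->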

                            <!-- Local Dictionary -->
                            <div class="accordion-item mb-3">
                                <h2 class="accordion-header">
                                    <button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapse3">
                                        <i class="fas fa-book me-2"></i> Local Dictionary Details
                                    </button>
                                </h2>
                                <div id="collapse3" class="accordion-collapse collapse" data-bs-parent="#howItWorksAccordion">
                                    <div class="accordion-body">
                                        <h5>Comprehensive Offline Translation Dictionary</h5>
                                        <p>The local dictionary contains <strong>1,983 translation entries</strong> covering:</p>
                                        <div class="row">
                                            <div class="col-md-6">
                                                <h6>Content Categories:</h6>
                                                <ul>
                                                    <li>Greetings & Common Phrases</li>
                                                    <li>Numbers & Dates</li>
                                                    <li>Days of Week & Months</li>
                                                    <li>Food & Drinks</li>
                                                    <li>Family & Relationships</li>
                                                    <li>Colors & Descriptions</li>
                                                    <li>Time & Places</li>
                                                    <li>Actions & Verbs</li>
                                                    <li>Technology Terms</li>
                                                    <li>Travel & Transportation</li>
                                                    <li>Business & Education</li>
                                                    <li>And much more!</li>
                                                </ul>
                                            </div>
                                            <div class="col-md-6">
                                                <h6>Language Pairs (6 bidirectional, 12 directions in total):</h6>
                                                <ul>
                                                    <li>English ↔ Spanish</li>
                                                    <li>English ↔ French</li>
                                                    <li>English ↔ German</li>
                                                    <li>Spanish ↔ French</li>
                                                    <li>Spanish ↔ German</li>
                                                    <li>French ↔ German</li>
                                                </ul>
                                                <p class="mt-3"><strong>File Location:</strong> <code>data/local_dictionary.json</code></p>
                                                <p><strong>Format:</strong> JSON with nested dictionaries for each language pair</p>
                                            </div>
                                        </div>
                                    </div>
                                </div>
                            </div>
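                            <!--
                            The pair count above follows from the four languages: each of the six listed
                            pairs works in both directions, giving 12 directed pairs. The sketch below shows
                            that arithmetic and one way to count entries in a nested dictionary. The language
                            codes, the "en-es" key format, and the function names are assumptions; the actual
                            layout of data/local_dictionary.json may differ.

```javascript
// Language codes are assumed; the page itself only names the languages.
const languages = ["en", "es", "fr", "de"];

// Builds every ordered (source, target) pair, skipping identical languages.
function directedPairs(codes) {
  const pairs = [];
  for (const source of codes) {
    for (const target of codes) {
      if (source !== target) pairs.push(source + "-" + target);
    }
  }
  return pairs;
}

// Counts entries in a nested { pair: { phrase: translation } } object.
function countEntries(dictionary) {
  return Object.values(dictionary).reduce(
    (total, entries) => total + Object.keys(entries).length,
    0
  );
}
```
                            -->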

                            <!-- Features -->
                            <div class="accordion-item mb-3">
                                <h2 class="accordion-header">
                                    <button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapse4">
                                        <i class="fas fa-star me-2"></i> Key Features & Capabilities
                                    </button>
                                </h2>
                                <div id="collapse4" class="accordion-collapse collapse" data-bs-parent="#howItWorksAccordion">
                                    <div class="accordion-body">
                                        <div class="row">
                                            <div class="col-md-6">
                                                <h6><i class="fas fa-check-circle text-success"></i> Core Features:</h6>
                                                <ul>
                                                    <li><strong>Real-time Translation:</strong> Translates as you type (500ms debounce)</li>
                                                    <li><strong>Auto Language Detection:</strong> Automatically detects source language</li>
                                                    <li><strong>Offline Support:</strong> Works without internet using local dictionary</li>
                                                    <li><strong>Word-by-Word Translation:</strong> Handles phrases not in exact dictionary</li>
                                                    <li><strong>Character Counter:</strong> Shows the current character count against the 5,000-character limit (for example 0/5000), with a warning at the limit</li>
                                                    <li><strong>Copy to Clipboard:</strong> Easy copy buttons for input/output</li>
                                                    <li><strong>Text-to-Speech:</strong> Listen to translations in target language</li>
                                                    <li><strong>Language Swap:</strong> One-click language direction reversal</li>
                                                </ul>
                                            </div>
                                            <div class="col-md-6">
                                                <h6><i class="fas fa-magic text-primary"></i> Advanced Features:</h6>
                                                <ul>
                                                    <li><strong>Smart Matching:</strong> Handles punctuation and case variations</li>
                                                    <li><strong>Status Indicators:</strong> Shows the translation source (Local Dictionary, Word-by-word, or API)</li>
                                                    <li><strong>Error Handling:</strong> Graceful fallbacks if translation fails</li>
                                                    <li><strong>Responsive Design:</strong> Works on desktop, tablet, and mobile</li>
                                                    <li><strong>Google Translate UI:</strong> Familiar, user-friendly interface</li>
                                                    <li><strong>Toast Notifications:</strong> Visual feedback for user actions</li>
                                                    <li><strong>Loading States:</strong> Shows spinner during translation</li>
                                                    <li><strong>Keyboard Shortcuts:</strong> Ctrl+Enter to translate</li>
                                                </ul>
                                            </div>
                                        </div>
                                    </div>
                                </div>
                            </div>
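                            <!--
                            "Smart Matching" and the character counter above could be approached along these
                            lines. Both helpers are hypothetical sketches: the exact punctuation set and the
                            shape of the counter result are assumptions, and only the 5,000-character limit
                            comes from the feature list.

```javascript
// Hypothetical normalizer: lowercases, strips leading and trailing
// punctuation, and collapses whitespace so that "Hello!" and " hello "
// map to the same dictionary key.
function normalizeKey(text) {
  return text
    .toLowerCase()
    .replace(/^[\s.,!?;:¡¿"']+|[\s.,!?;:¡¿"']+$/g, "")
    .replace(/\s+/g, " ");
}

// Builds a "count/limit" label and flags input that exceeds the limit.
// Note: length counts UTF-16 code units, not user-perceived characters.
function formatCharCount(text, limit = 5000) {
  return { label: text.length + "/" + limit, overLimit: text.length > limit };
}
```
                            -->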

                            <!-- Dataset Information -->
                            <div class="accordion-item mb-3">
                                <h2 class="accordion-header">
                                    <button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapse5">
                                        <i class="fas fa-database me-2"></i> Dataset Information
                                    </button>
                                </h2>
                                <div id="collapse5" class="accordion-collapse collapse" data-bs-parent="#howItWorksAccordion">
                                    <div class="accordion-body">
                                        <h5>Complete Dataset Structure:</h5>
                                        <table class="table table-bordered mt-3">
                                            <thead class="table-light">
                                                <tr>
                                                    <th>File</th>
                                                    <th>Format</th>
                                                    <th>Entries</th>
                                                    <th>Purpose</th>
                                                </tr>
                                            </thead>
                                            <tbody>
                                                <tr>
                                                    <td><code>train.json</code> / <code>train.tsv</code></td>
                                                    <td>JSON / TSV</td>
                                                    <td>50</td>
                                                    <td>Training dataset with parallel sentences</td>
                                                </tr>
                                                <tr>
                                                    <td><code>validation.json</code> / <code>validation.tsv</code></td>
                                                    <td>JSON / TSV</td>
                                                    <td>5</td>
                                                    <td>Validation dataset for evaluating trained models</td>
                                                </tr>
                                                <tr>
                                                    <td><code>sample_data.json</code></td>
                                                    <td>JSON</td>
                                                    <td>15</td>
                                                    <td>Sample data for preview/demo</td>
                                                </tr>
                                                <tr>
                                                    <td><code>local_dictionary.json</code></td>
                                                    <td>JSON</td>
                                                    <td>1,983</td>
                                                    <td>Comprehensive offline translation dictionary</td>
                                                </tr>
                                            </tbody>
                                        </table>
                                        <p class="mt-3"><strong>Total Translation Entries:</strong> 2,053 (50 + 5 + 15 + 1,983)</p>
                                        <p><strong>Languages Covered:</strong> English, Spanish, French, German (4 languages, 12 directed pairs)</p>
                                    </div>
                                </div>
                            </div>
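                            <!--
                            The TSV files in the table above can be read with a few lines of code. This is a
                            generic sketch, assuming the first line of each file is a header row and that
                            fields contain no embedded tabs or newlines. The function name is hypothetical,
                            and the real column names in train.tsv are not stated on this page.

```javascript
// Parses TSV text into row objects keyed by the header row.
// Blank lines are skipped; missing trailing cells become empty strings.
function parseTsv(tsvText) {
  const lines = tsvText.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];
  const header = lines[0].split("\t");
  return lines.slice(1).map((line) => {
    const cells = line.split("\t");
    const row = {};
    header.forEach((name, index) => {
      row[name] = cells[index] !== undefined ? cells[index] : "";
    });
    return row;
  });
}
```
                            -->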

                            <!-- Technical Details -->
                            <div class="accordion-item mb-3">
                                <h2 class="accordion-header">
                                    <button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapse6">
                                        <i class="fas fa-code me-2"></i> Technical Implementation
                                    </button>
                                </h2>
                                <div id="collapse6" class="accordion-collapse collapse" data-bs-parent="#howItWorksAccordion">
                                    <div class="accordion-body">
                                        <h5>Technology Stack:</h5>
                                        <div class="row mt-3">
                                            <div class="col-md-6">
                                                <h6>Frontend:</h6>
                                                <ul>
                                                    <li><strong>HTML5:</strong> Semantic markup</li>
                                                    <li><strong>CSS3:</strong> Custom styling with Google Translate-inspired design</li>
                                                    <li><strong>JavaScript (ES6+):</strong> Vanilla JS, no frameworks</li>
                                                    <li><strong>Bootstrap 5:</strong> Responsive grid and components</li>
                                                    <li><strong>Font Awesome 6:</strong> Icons</li>
                                                </ul>
                                            </div>
                                            <div class="col-md-6">
                                                <h6>Backend/Data:</h6>
                                                <ul>
                                                    <li><strong>JSON:</strong> Data storage format</li>
                                                    <li><strong>TSV:</strong> Tab-separated values for easy processing</li>
                                                    <li><strong>Python 3:</strong> Data processing scripts</li>
                                                    <li><strong>MyMemory API:</strong> Free translation API fallback</li>
                                                    <li><strong>Web Speech API:</strong> Text-to-speech functionality</li>
                                                </ul>
                                            </div>
                                        </div>
                                        <h6 class="mt-4">Key Functions:</h6>
                                        <ul>
                                            <li><code>loadLocalDictionary()</code> - Loads offline dictionary</li>
                                            <li><code>loadTranslationData()</code> - Loads dataset translations</li>
                                            <li><code>translateText()</code> - Main translation function</li>
                                            <li><code>translateWordByWord()</code> - Word-by-word translation</li>
                                            <li><code>translateWithAPI()</code> - API fallback translation</li>
                                            <li><code>detectLanguage()</code> - Auto language detection</li>
                                            <li><code>handleInput()</code> - Auto-translate on input</li>
                                        </ul>
                                    </div>
                                </div>
                            </div>

                            <!-- Usage Tips -->
                            <div class="accordion-item mb-3">
                                <h2 class="accordion-header">
                                    <button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapse7">
                                        <i class="fas fa-lightbulb me-2"></i> Usage Tips &amp; Best Practices
                                    </button>
                                </h2>
                                <div id="collapse7" class="accordion-collapse collapse" data-bs-parent="#howItWorksAccordion">
                                    <div class="accordion-body">
                                        <h5>Tips for Best Results:</h5>
                                        <ol>
                                            <li><strong>Use Complete Sentences:</strong> Full sentences translate better than single words</li>
                                            <li><strong>Check Status Indicator:</strong> See whether the translation came from the local dictionary (faster) or the API</li>
                                            <li><strong>Offline Mode:</strong> Common phrases are translated from the local dictionary without calling the API</li>
                                            <li><strong>Language Detection:</strong> Use "Detect language" if you are unsure of the source language</li>
                                            <li><strong>Character Limit:</strong> Maximum 5,000 characters per translation</li>
                                            <li><strong>Copy &amp; Paste:</strong> Copy buttons are provided for both input and output</li>
                                            <li><strong>Listen Feature:</strong> Use the speaker icon to hear the translation read aloud</li>
                                            <li><strong>Swap Languages:</strong> Quickly reverse translation direction</li>
                                        </ol>
                                        <h6 class="mt-3">Common Use Cases:</h6>
                                        <ul>
                                            <li>Learning new languages</li>
                                            <li>Quick phrase translation</li>
                                            <li>Understanding foreign text</li>
                                            <li>Travel communication</li>
                                            <li>Language practice</li>
                                            <li>Document translation</li>
                                        </ul>
                                    </div>
                                </div>
                            </div>

                            <!-- Project Info -->
                            <div class="accordion-item mb-3">
                                <h2 class="accordion-header">
                                    <button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapse8">
                                        <i class="fas fa-info me-2"></i> Project Information
                                    </button>
                                </h2>
                                <div id="collapse8" class="accordion-collapse collapse" data-bs-parent="#howItWorksAccordion">
                                    <div class="accordion-body">
                                        <h5>About This Project:</h5>
                                        <p><strong>Project ID:</strong> 25</p>
                                        <p><strong>Category:</strong> Text Data</p>
                                        <p><strong>Difficulty:</strong> Advanced</p>
                                        <p><strong>Year:</strong> 2016</p>
                                        <p><strong>Technologies:</strong> TSV, JSON, Transformers, mBERT, mT5</p>
                                        
                                        <h6 class="mt-4">Project Structure:</h6>
                                        <pre class="bg-light p-3 rounded"><code>language-translation/
├── data/              # Dataset files (JSON, TSV)
├── scripts/           # Python processing scripts
├── examples/          # Usage examples
├── index.html         # Main demo page
└── *.md               # Documentation (README, SETUP, etc.) in the project root</code></pre>
                                        
                                        <h6 class="mt-4">Available Scripts:</h6>
                                        <ul>
                                            <li><code>process_data.py</code> - Process and convert datasets</li>
                                            <li><code>convert_format.py</code> - Convert between TSV and JSON</li>
                                            <li><code>analyze_dataset.py</code> - Analyze dataset statistics</li>
                                            <li><code>download_translation_data.py</code> - Download from public sources</li>
                                            <li><code>build_local_dictionary.py</code> - Build local dictionary</li>
                                        </ul>
                                        
                                        <p class="mt-3"><strong>Created by:</strong> RSK World</p>
                                        <p><strong>Website:</strong> <a href="https://rskworld.in" target="_blank">https://rskworld.in</a></p>
                                        <p><strong>Email:</strong> help@rskworld.in</p>
                                        <p><strong>Phone:</strong> +91 93305 39277</p>
                                    </div>
                                </div>
                            </div>
                        </div>
                    </div>
                </div>
            </div>

            <!-- Google Translate Style Interface -->
            <div class="translate-container">
                <div class="translate-header">
                    <div class="language-selector">
                        <select id="fromLang" class="lang-select">
                            <option value="auto">Detect language</option>
                            <option value="en" selected>English</option>
                            <option value="es">Spanish</option>
                            <option value="fr">French</option>
                            <option value="de">German</option>
                        </select>
                        <button class="swap-btn" onclick="swapLanguages()" title="Swap languages">
                            <i class="fas fa-exchange-alt"></i>
                        </button>
                        <select id="toLang" class="lang-select">
                            <option value="es" selected>Spanish</option>
                            <option value="en">English</option>
                            <option value="fr">French</option>
                            <option value="de">German</option>
                        </select>
                    </div>
                </div>
                
                <div class="translate-body">
                    <div class="translate-box">
                        <div class="translate-box-header">
                            <span class="lang-label" id="fromLangLabel">English</span>
                            <div class="box-actions">
                                <button class="action-btn" onclick="clearInput()" title="Clear">
                                    <i class="fas fa-times"></i>
                                </button>
                                <button class="action-btn" onclick="copyInput()" title="Copy">
                                    <i class="fas fa-copy"></i>
                                </button>
                            </div>
                        </div>
                        <textarea id="inputText" class="translate-textarea" placeholder="Enter text" maxlength="5000" oninput="handleInput()"></textarea>
                        <div class="translate-footer">
                            <span class="char-count" id="charCount">0 / 5000</span>
                            <span class="detect-lang" id="detectedLang" style="display:none;"></span>
                        </div>
                    </div>
                    
                    <div class="translate-box">
                        <div class="translate-box-header">
                            <span class="lang-label" id="toLangLabel">Spanish</span>
                            <div class="box-actions">
                                <button class="action-btn" onclick="copyOutput()" title="Copy">
                                    <i class="fas fa-copy"></i>
                                </button>
                                <button class="action-btn" onclick="speakOutput()" title="Listen">
                                    <i class="fas fa-volume-up"></i>
                                </button>
                            </div>
                        </div>
                        <div id="outputText" class="translate-output">
                            <span style="color: #9aa0a6;">Translation</span>
                        </div>
                        <div class="translate-footer">
                            <span id="translationStatus" style="color: #5f6368; font-size: 12px;"></span>
                        </div>
                    </div>
                </div>
            </div>

            <div class="row mb-4">
                <div class="col-md-12">
                    <h3 class="mb-3">Dataset Preview</h3>
                    <div class="dataset-preview">
                        <div class="table-responsive">
                            <table class="table table-striped table-hover">
                                <thead class="table-dark">
                                    <tr>
                                        <th>ID</th>
                                        <th>English</th>
                                        <th>Spanish</th>
                                        <th>French</th>
                                        <th>German</th>
                                    </tr>
                                </thead>
                                <tbody id="datasetTable">
                                    <!-- Data will be loaded here -->
                                </tbody>
                            </table>
                        </div>
                    </div>
                </div>
            </div>

            <div class="row mb-4">
                <div class="col-md-12 text-center">
                    <a href="./language-translation.zip" class="btn btn-download btn-lg" download>
                        <i class="fas fa-download"></i> Download Dataset
                    </a>
                </div>
            </div>

            <div class="row">
                <div class="col-md-12">
                    <h3 class="mb-3">Dataset Features</h3>
                    <ul class="list-group">
                        <li class="list-group-item"><i class="fas fa-check text-success me-2"></i> Parallel sentences</li>
                        <li class="list-group-item"><i class="fas fa-check text-success me-2"></i> Multiple language pairs</li>
                        <li class="list-group-item"><i class="fas fa-check text-success me-2"></i> Aligned translations</li>
                        <li class="list-group-item"><i class="fas fa-check text-success me-2"></i> Training and validation sets</li>
                        <li class="list-group-item"><i class="fas fa-check text-success me-2"></i> Ready for translation models</li>
                    </ul>
                </div>
            </div>
        </div>

        <div class="footer">
            <p class="mb-2">
                <strong>Language Translation Dataset</strong> - Created by <a href="https://rskworld.in" target="_blank">RSK World</a>
            </p>
            <p class="mb-0">
                <small>
                    Email: <a href="mailto:help@rskworld.in">help@rskworld.in</a> | 
                    Phone: <a href="tel:+919330539277">+91 93305 39277</a> | 
                    Website: <a href="https://rskworld.in" target="_blank">rskworld.in</a>
                </small>
            </p>
            <p class="mt-2 mb-0">
                <small>Copyright © 2016 RSK World. All rights reserved.</small>
            </p>
        </div>
    </div>

    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script>
    <script>
        // Translation dictionary for common words and phrases
        const translationDict = {
            'en-es': {}, 'en-fr': {}, 'en-de': {},
            'es-en': {}, 'es-fr': {}, 'es-de': {},
            'fr-en': {}, 'fr-es': {}, 'fr-de': {},
            'de-en': {}, 'de-es': {}, 'de-fr': {}
        };

        const languageNames = {
            'en': 'English',
            'es': 'Spanish',
            'fr': 'French',
            'de': 'German',
            'auto': 'Detect language'
        };

        let translateTimeout;
        let detectedLanguage = 'en';

        // Load local dictionary first (offline)
        async function loadLocalDictionary() {
            try {
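                const response = await fetch('data/local_dictionary.json');
                if (!response.ok) {
                    throw new Error(`HTTP ${response.status} while fetching data/local_dictionary.json`);
                }
                const localDict = await response.json();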
                
                // Merge local dictionary into translationDict
                for (const [key, value] of Object.entries(localDict)) {
                    if (translationDict[key]) {
                        Object.assign(translationDict[key], value);
                    }
                }
                console.log('[OK] Loaded local dictionary');
            } catch (error) {
                console.error('Error loading local dictionary:', error);
            }
        }

        // Load translation data from the dataset
        async function loadTranslationData() {
            // Dataset field name for each language code
            const fields = { en: 'english', es: 'spanish', fr: 'french', de: 'german' };
            const codes = Object.keys(fields);

            try {
                // Load the local dictionary first so its entries take precedence
                await loadLocalDictionary();

                // Then fill any gaps from train.json
                const response = await fetch('data/train.json');
                if (!response.ok) {
                    throw new Error(`HTTP ${response.status} while fetching data/train.json`);
                }
                const data = await response.json();

                data.forEach(item => {
                    for (const from of codes) {
                        const sourceText = item[fields[from]];
                        if (!sourceText) continue;
                        const key = sourceText.toLowerCase();

                        for (const to of codes) {
                            const targetText = item[fields[to]];
                            if (from === to || !targetText) continue;

                            // Keep the first translation seen for each key
                            const pairDict = translationDict[`${from}-${to}`];
                            if (!pairDict[key]) {
                                pairDict[key] = targetText;
                            }
                        }
                    }
                });
                console.log('[OK] Loaded dataset translations');
            } catch (error) {
                console.error('Error loading translation data:', error);
            }
        }

        // Detect language using API
        async function detectLanguage(text) {
            if (!text.trim()) return 'en';
            try {
                const response = await fetch(`https://api.mymemory.translated.net/get?q=${encodeURIComponent(text)}&langpair=auto|en`);
                const data = await response.json();
                if (data.responseStatus === 200 && data.responseData) {
                    return data.responseData.detectedSourceLanguage || 'en';
                }
            } catch (error) {
                console.error('Language detection error:', error);
            }
            return 'en';
        }

        // Translate using API (MyMemory Translation API - free)
        async function translateWithAPI(text, fromLang, toLang) {
            if (!text.trim()) return null;
            
            try {
                const url = `https://api.mymemory.translated.net/get?q=${encodeURIComponent(text)}&langpair=${fromLang}|${toLang}`;
                const response = await fetch(url);
                const data = await response.json();
                
                if (data.responseStatus === 200 && data.responseData) {
                    return data.responseData.translatedText;
                }
            } catch (error) {
                console.error('API translation error:', error);
            }
            return null;
        }

        // Update character count
        function updateCharCount() {
            const inputText = document.getElementById('inputText').value;
            const charCount = document.getElementById('charCount');
            charCount.textContent = `${inputText.length} / 5000`;
            if (inputText.length > 5000) {
                charCount.style.color = '#ea4335';
            } else {
                charCount.style.color = '#5f6368';
            }
        }

        // Handle input with auto-translate
        function handleInput() {
            updateCharCount();
            const inputText = document.getElementById('inputText').value.trim();
            const outputDiv = document.getElementById('outputText');
            
            if (!inputText) {
                outputDiv.innerHTML = '<span style="color: #9aa0a6;">Translation</span>';
                document.getElementById('translationStatus').textContent = '';
                document.getElementById('detectedLang').style.display = 'none';
                return;
            }

            // Clear previous timeout
            clearTimeout(translateTimeout);
            
            // Show loading
            outputDiv.innerHTML = '<span style="color: #9aa0a6;"><i class="fas fa-spinner fa-spin"></i> Translating...</span>';
            
            // Auto-translate after 500ms delay (debounce)
            translateTimeout = setTimeout(() => {
                translateText();
            }, 500);
        }

        // Translate word by word for better coverage
        function translateWordByWord(text, fromLang, toLang) {
            const words = text.toLowerCase().split(/\s+/);
            const dictKey = `${fromLang}-${toLang}`;
            const dict = translationDict[dictKey] || {};
            
            const translatedWords = words.map(word => {
                // Try exact match
                if (dict[word]) {
                    return dict[word];
                }
                // Try without punctuation
                const cleanWord = word.replace(/[.,!?;:]/g, '');
                if (dict[cleanWord]) {
                    return dict[cleanWord];
                }
                // Return original if not found
                return word;
            });
            
            return translatedWords.join(' ');
        }

        // Main translation function
        let latestTranslationId = 0; // Marks the most recent request so stale async results are dropped

        async function translateText() {
            const inputText = document.getElementById('inputText').value.trim();
            let fromLang = document.getElementById('fromLang').value;
            const toLang = document.getElementById('toLang').value;
            const outputDiv = document.getElementById('outputText');
            const statusDiv = document.getElementById('translationStatus');
            const detectedLangSpan = document.getElementById('detectedLang');
            const requestId = ++latestTranslationId;
            
            if (!inputText) {
                outputDiv.innerHTML = '<span style="color: #9aa0a6;">Translation</span>';
                statusDiv.textContent = '';
                return;
            }
            
            // Auto-detect language if selected
            if (fromLang === 'auto') {
                const detected = await detectLanguage(inputText);
                if (requestId !== latestTranslationId) return; // A newer request superseded this one
                detectedLanguage = detected;
                fromLang = detectedLanguage;
                // Fall back to the raw code when the detected language has no display name
                detectedLangSpan.textContent = `Detected: ${languageNames[detectedLanguage] || detectedLanguage}`;
                detectedLangSpan.style.display = 'inline';
            } else {
                detectedLangSpan.style.display = 'none';
            }
            
            // Check local dictionary first (offline)
            const dictKey = `${fromLang}-${toLang}`;
            const lowerText = inputText.toLowerCase();
            // translateWordByWord() rejoins words with single spaces, so compare against the same form
            const normalizedText = lowerText.split(/\s+/).join(' ');
            let translation = null;
            let source = '';
            
            // Try exact match first
            if (translationDict[dictKey] && translationDict[dictKey][lowerText]) {
                translation = translationDict[dictKey][lowerText];
                source = 'Local Dictionary';
            } else {
                // Try word-by-word translation
                const wordByWord = translateWordByWord(inputText, fromLang, toLang);
                if (wordByWord !== normalizedText) {
                    translation = wordByWord;
                    source = 'Local Dictionary (Word-by-word)';
                } else {
                    // Try API translation as fallback
                    translation = await translateWithAPI(inputText, fromLang, toLang);
                    if (requestId !== latestTranslationId) return; // A newer request superseded this one
                    if (translation) {
                        source = 'API';
                    } else {
                        translation = 'Translation not available. Please try again.';
                        source = 'Not Available';
                    }
                }
            }
            
            // Use textContent so user input and API responses are never parsed as HTML
            outputDiv.textContent = translation;
            statusDiv.textContent = source ? `Translation from ${source}` : '';
        }

        // Swap languages
        function swapLanguages() {
            const fromLang = document.getElementById('fromLang');
            const toLang = document.getElementById('toLang');
            const inputText = document.getElementById('inputText');
            const outputText = document.getElementById('outputText');
            const statusText = document.getElementById('translationStatus').textContent;
            
            const tempLang = fromLang.value;
            fromLang.value = toLang.value === 'auto' ? 'en' : toLang.value;
            toLang.value = tempLang === 'auto' ? 'en' : tempLang;
            
            updateLanguageLabels();
            
            // Reuse the output only when it holds a real translation: the placeholder and
            // spinner are rendered as child elements, and failures are flagged in the status line
            const hasTranslation = outputText.children.length === 0 && !statusText.endsWith('Not Available');
            inputText.value = hasTranslation ? outputText.textContent.trim() : '';
            outputText.innerHTML = '<span style="color: #9aa0a6;">Translation</span>';
            
            // handleInput() also resets the counter and status when the input is now empty
            handleInput();
        }

        // Update language labels
        function updateLanguageLabels() {
            const fromLang = document.getElementById('fromLang').value;
            const toLang = document.getElementById('toLang').value;
            document.getElementById('fromLangLabel').textContent = languageNames[fromLang] || 'Unknown';
            document.getElementById('toLangLabel').textContent = languageNames[toLang] || 'Unknown';
        }

        // Clear input
        function clearInput() {
            clearTimeout(translateTimeout); // Cancel any pending auto-translate
            document.getElementById('inputText').value = '';
            document.getElementById('outputText').innerHTML = '<span style="color: #9aa0a6;">Translation</span>';
            updateCharCount(); // Resets both the counter text and its color
            document.getElementById('translationStatus').textContent = '';
            document.getElementById('detectedLang').style.display = 'none';
        }

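        // Copy functions
        function copyToClipboard(text) {
            if (!text) return;
            // navigator.clipboard exists only in secure contexts (HTTPS or localhost)
            if (!navigator.clipboard) {
                showToast('Copy is not available on this page');
                return;
            }
            navigator.clipboard.writeText(text)
                .then(() => showToast('Copied to clipboard'))
                .catch(error => {
                    console.error('Copy failed:', error);
                    showToast('Copy failed');
                });
        }

        function copyInput() {
            copyToClipboard(document.getElementById('inputText').value);
        }

        function copyOutput() {
            const outputDiv = document.getElementById('outputText');
            // The placeholder and spinner are child elements; a real translation is plain text
            if (outputDiv.children.length > 0) return;
            copyToClipboard(outputDiv.textContent);
        }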

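        // Speak output
        function speakOutput() {
            const outputDiv = document.getElementById('outputText');
            const toLang = document.getElementById('toLang').value;
            // Skip the placeholder and spinner, which are rendered as child elements
            if (outputDiv.children.length > 0) return;
            const outputText = outputDiv.textContent.trim();
            if (outputText && 'speechSynthesis' in window) {
                window.speechSynthesis.cancel(); // Stop any utterance that is still playing
                const utterance = new SpeechSynthesisUtterance(outputText);
                const langMap = {'en': 'en-US', 'es': 'es-ES', 'fr': 'fr-FR', 'de': 'de-DE'};
                utterance.lang = langMap[toLang] || 'en-US';
                window.speechSynthesis.speak(utterance);
            }
        }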

        // Show toast notification
        function showToast(message) {
            const toast = document.createElement('div');
            toast.style.cssText = 'position:fixed;bottom:20px;right:20px;background:#202124;color:white;padding:12px 20px;border-radius:4px;z-index:10000;font-size:14px;';
            toast.textContent = message;
            document.body.appendChild(toast);
            setTimeout(() => toast.remove(), 2000);
        }

        // Event listeners
        document.getElementById('fromLang').addEventListener('change', updateLanguageLabels);
        document.getElementById('toLang').addEventListener('change', updateLanguageLabels);
        document.getElementById('fromLang').addEventListener('change', () => {
            if (document.getElementById('inputText').value.trim()) {
                translateText();
            }
        });
        document.getElementById('toLang').addEventListener('change', () => {
            if (document.getElementById('inputText').value.trim()) {
                translateText();
            }
        });

        // Load sample data for preview
        fetch('data/sample_data.json')
            .then(response => {
                if (!response.ok) {
                    throw new Error(`HTTP ${response.status} while fetching data/sample_data.json`);
                }
                return response.json();
            })
            .then(data => {
                const tbody = document.getElementById('datasetTable');
                tbody.innerHTML = '';
                data.slice(0, 10).forEach(row => {
                    const tr = document.createElement('tr');
                    // Use textContent so dataset text is never parsed as HTML
                    ['id', 'english', 'spanish', 'french', 'german'].forEach(field => {
                        const td = document.createElement('td');
                        td.textContent = row[field] ?? '';
                        tr.appendChild(td);
                    });
                    tbody.appendChild(tr);
                });
            })
            .catch(error => {
                console.error('Error loading data:', error);
                document.getElementById('datasetTable').innerHTML = 
                    '<tr><td colspan="5" class="text-center">Could not load sample data.</td></tr>';
            });

        // Initialize on page load
        window.addEventListener('DOMContentLoaded', () => {
            loadTranslationData();
            updateLanguageLabels();
        });
    </script>
</body>
</html>
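
The `loadTranslationData()` function in the script above fills one dictionary per language direction from parallel dataset rows. The same idea can be sketched as a standalone function. This is a minimal sketch, not the project's code: `buildPairDictionaries` and `LANGUAGE_FIELDS` are hypothetical names, and the row shape (`english`, `spanish`, `french`, `german` fields) is assumed from how the script reads `data/train.json`.

```javascript
// Hypothetical helper, not part of index.html: builds every directed
// language-pair dictionary from parallel rows.
const LANGUAGE_FIELDS = { en: 'english', es: 'spanish', fr: 'french', de: 'german' };

function buildPairDictionaries(rows, fields = LANGUAGE_FIELDS) {
    const codes = Object.keys(fields);
    const dict = {};

    // Create an empty dictionary for each ordered pair, e.g. 'en-es'.
    for (const from of codes) {
        for (const to of codes) {
            if (from !== to) dict[`${from}-${to}`] = {};
        }
    }

    for (const row of rows) {
        for (const from of codes) {
            const sourceText = row[fields[from]];
            if (!sourceText) continue;
            const key = sourceText.toLowerCase();

            for (const to of codes) {
                const targetText = row[fields[to]];
                if (from === to || !targetText) continue;

                const pair = dict[`${from}-${to}`];
                // First entry wins; the own-property check avoids treating
                // inherited names such as 'constructor' as existing keys.
                if (!Object.prototype.hasOwnProperty.call(pair, key)) {
                    pair[key] = targetText;
                }
            }
        }
    }
    return dict;
}

// Usage with two made-up rows (illustrative data, not from the dataset).
const pairs = buildPairDictionaries([
    { english: 'Hello', spanish: 'Hola', french: 'Bonjour' },
    { english: 'hello', spanish: 'Buenas' }
]);
console.log(pairs['en-es'].hello, pairs['es-fr'].hola);
```

Rows may omit a language; the sketch simply skips any direction whose source or target text is missing.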

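The lookup order used by `translateText()` (exact phrase match, then word-by-word, then the API) can be sketched as a pure function covering the two local steps. This is a rough sketch under stated assumptions: `lookupLocal` and its `{ translation, source }` return shape are hypothetical, `dict` stands for a single direction such as `translationDict['en-es']`, and a `null` result signals that the caller should try the API.

```javascript
// Hypothetical helper, not part of index.html.
function hasEntry(dict, key) {
    return Object.prototype.hasOwnProperty.call(dict, key);
}

function lookupLocal(dict, text) {
    // Lowercase and collapse whitespace so 'Good   Morning' matches 'good morning'.
    const words = text.trim().toLowerCase().split(/\s+/);
    const normalized = words.join(' ');
    if (!normalized) return null;

    // Step 1: whole-phrase match.
    if (hasEntry(dict, normalized)) {
        return { translation: dict[normalized], source: 'exact' };
    }

    // Step 2: word-by-word. Count hits instead of comparing strings, so
    // "nothing was translated" is detected reliably.
    let hits = 0;
    const translated = words.map(word => {
        const bare = word.replace(/[.,!?;:]/g, '');
        for (const candidate of [word, bare]) {
            if (candidate && hasEntry(dict, candidate)) {
                hits += 1;
                return dict[candidate];
            }
        }
        return word; // Unknown words pass through unchanged
    });
    if (hits > 0) {
        return { translation: translated.join(' '), source: 'word-by-word' };
    }

    // Step 3: nothing matched locally, so the caller falls back to the API.
    return null;
}

// Illustrative dictionary; the entries are made up for the example.
const sampleDict = { 'good morning': 'buenos días', 'good': 'bueno', 'cat': 'gato' };
console.log(lookupLocal(sampleDict, 'Good   Morning'));
console.log(lookupLocal(sampleDict, 'good cat!'));
console.log(lookupLocal(sampleDict, 'xyz'));
```

As in the page's own word-by-word step, punctuation attached to a matched word is dropped from the result.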

About RSK World

Founded by Molla Samser, with Designer & Tester Rima Khatun, RSK World is your one-stop destination for free programming resources, source code, and development tools.

Contact Info

Nutanhat, Mongolkote
Purba Burdwan, West Bengal
India, 713147

+91 93305 39277

hello@rskworld.in
support@rskworld.in

© 2026 RSK World. All rights reserved.

Content used for educational purposes only. View Disclaimer