Project Overview
NLP Text Analysis Bot is a feature-rich Python application that provides comprehensive text analysis using Natural Language Processing. It offers text preprocessing, sentiment analysis, entity recognition, semantic understanding, text summarization, language detection, emotion detection, readability analysis, text classification, POS tagging, text similarity, advanced keyword extraction, and a Flask web interface. It is well suited for developers who want to build NLP applications or integrate text analysis into their own projects.
Features
Core Features
Text Preprocessing
Cleaning, tokenization, stopword removal, and lemmatization for clean text analysis.
Sentiment Analysis
Multi-method sentiment detection using VADER and transformer models for accurate sentiment classification.
Entity Recognition
Named entity extraction using spaCy to identify people, organizations, locations, and more.
Semantic Understanding
Keyword extraction, topic identification, and phrase analysis for deep text understanding.
Advanced Features
Text Summarization
Extractive and abstractive summarization using transformer models for concise summaries.
Language Detection
Automatic language detection with confidence scores for multi-language text analysis.
Text Classification
Zero-shot text classification into multiple categories with confidence scores.
Emotion Detection
Advanced emotion analysis beyond basic sentiment (joy, sadness, anger, fear, etc.).
Readability Analysis
Multiple readability metrics (Flesch, SMOG, Coleman-Liau, etc.) and grade levels.
Advanced Keywords
TF-IDF based keyword extraction with n-grams for important term identification.
POS Tagging
Complete part-of-speech analysis with distribution statistics and tagging.
Text Similarity
Calculate similarity between texts using multiple methods (cosine, Jaccard, etc.).
Flask Web Interface
Beautiful and modern Flask-based web application with responsive design and visualizations.
Data Visualization
Interactive charts and graphs for sentiment distribution, entity types, and analysis metrics.
Advanced Error Handling
Comprehensive error handling with user-friendly messages and graceful fallbacks.
Well Documented
Complete documentation with examples, quick start guide, and detailed installation instructions.
Technologies

| Technology | Category | Description |
|---|---|---|
| Python 3.8+ | Language | Modern Python programming language |
| Flask 2.3.0+ | Framework | Lightweight Python web framework |
| NLTK 3.6+ | NLP Library | Natural Language Toolkit for text processing |
| spaCy 3.4.0+ | NLP Library | Advanced NLP library for entity extraction |
| scikit-learn 1.0.0+ | ML Library | Machine learning library for NLP |
| NumPy 1.21.0+ | Library | Numerical computing library |
| Requests 2.28.0+ | Library | HTTP library for API calls |

Installation Guide - Step by Step
Installation Time: ~10 minutes
Follow these detailed steps to install and set up the NLP Text Analysis Bot on your system.
Prerequisites
pip Package Manager
Usually comes with Python. Verify: pip --version
Terminal/Command Prompt
Windows: PowerShell or CMD
Linux/Mac: Terminal
Internet Connection
Required for downloading packages and NLP models
Step-by-Step Installation
Option A: Download ZIP
- Download the project ZIP file from the repository
- Extract it to your desired location (e.g., C:\Projects\nlp-text-analysis-bot or ~/Projects/nlp-text-analysis-bot)
- Open a terminal/command prompt in the extracted folder
Option B: Clone with Git
git clone https://github.com/rskworld/nlp-text-analysis-bot.git
cd nlp-text-analysis-bot
Virtual environments isolate project dependencies and prevent conflicts.
Windows:
# Create virtual environment
python -m venv venv
# Activate virtual environment
venv\Scripts\activate
# You should see (venv) in your prompt
Linux/Mac:
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate
# You should see (venv) in your prompt
This will install all required packages including NLTK, spaCy, Flask, Transformers, etc.
# Upgrade pip first (recommended)
pip install --upgrade pip
# Install all dependencies
pip install -r requirements.txt
# This may take 5-10 minutes depending on your internet speed
Required for entity recognition and advanced NLP features.
# Download spaCy English model
python -m spacy download en_core_web_sm
# This downloads ~50MB model file
NLTK data is usually downloaded automatically on first use, but you can download it manually:
# Download required NLTK data
python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords'); nltk.download('wordnet'); nltk.download('vader_lexicon'); nltk.download('averaged_perceptron_tagger')"
Run the validation script to check if everything is installed correctly:
# Run validation script
python validate_project.py
# This will check all dependencies and models
Start the Flask web interface:
# Start Flask web interface
python app.py
# You should see:
# * Running on http://127.0.0.1:5000
# Press CTRL+C to quit
Access the Application:
- Open your web browser
- Navigate to: http://localhost:5000 or http://127.0.0.1:5000
- You should see the NLP Text Analysis Bot interface
Installation Complete!
Congratulations! You've successfully installed the NLP Text Analysis Bot. The Flask web interface provides the best user experience with all advanced features including:
- Text preprocessing and analysis
- Sentiment analysis with visualizations
- Entity recognition
- Text summarization
- Language detection
- Emotion detection
- Readability analysis
- Interactive charts and graphs
Next Steps:
- Open http://localhost:5000 in your browser
- Enter some text in the input field
- Click "Analyze Text" to see comprehensive analysis results
- Explore all the features and visualizations!
Troubleshooting Installation
ModuleNotFoundError
Solution: Make sure the virtual environment is activated and run pip install -r requirements.txt again.
spaCy Model Not Found
Solution: Run python -m spacy download en_core_web_sm
NLTK Data Missing
Solution: Run the NLTK download command from Step 5, or it will download automatically on first use.
Port Already in Use
Solution: Change the port in app.py or stop the process using port 5000.
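If port 5000 is taken, one option is to let the OS pick a free port before starting Flask. This is a sketch, not part of the project; the returned port would then be passed to Flask's app.run(port=...):

```python
import socket

def find_free_port(preferred=5000):
    """Return `preferred` if it is free, otherwise an OS-assigned free port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(("127.0.0.1", preferred))
            return preferred  # preferred port is available
        except OSError:
            pass  # preferred port is in use; fall through
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
        return s.getsockname()[1]

port = find_free_port()
print(port)
```

Note there is a small race between checking the port and Flask actually binding it, which is acceptable for local development.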
Usage Guide - Step by Step
Getting Started with the Web Interface
Step-by-Step Demo Instructions
Step 1: Start the Application
- Open a terminal/command prompt in the project directory
- Activate the virtual environment (if not already activated):
  # Windows: venv\Scripts\activate
  # Linux/Mac: source venv/bin/activate
- Run the Flask application: python app.py
- Wait for the message: * Running on http://127.0.0.1:5000
Step 2: Open the Web Interface
- Open your web browser (Chrome, Firefox, Safari, or Edge)
- Navigate to: http://localhost:5000
- You should see the NLP Text Analysis Bot interface with:
- Text input area
- Analyze button
- Results display area
- Visualization charts section
Step 3: Analyze Your First Text
- Enter Sample Text: Type or paste text in the input field. Example: "I love this product! It's amazing and works perfectly. The customer service is excellent too."
- Click the "Analyze Text" button
- Wait for Analysis: The system will process your text (takes 2-5 seconds)
- View Results: You'll see comprehensive analysis including:
  - Sentiment scores and distribution chart
  - Named entities (people, organizations, locations)
  - Text summary
  - Detected language
  - Emotion analysis
  - Readability metrics
  - Keywords and topics
  - POS tagging results
Step 4: Explore Visualizations
The interface includes interactive charts and graphs:
- Sentiment Distribution Chart: Pie or bar chart showing positive/negative/neutral sentiment
- Entity Type Chart: Bar chart showing different entity types found
- Emotion Distribution: Visual representation of detected emotions
- Readability Metrics: Comparison charts for different readability scores
- Keyword Frequency: Word cloud or bar chart of important keywords
Step 5: Try Different Text Types
Experiment with various text samples to see different analysis results:
- Product Reviews: Analyze customer feedback sentiment
- News Articles: Extract entities and summarize content
- Social Media Posts: Detect emotions and sentiment
- Technical Documents: Analyze readability and extract keywords
- Multi-language Text: Test language detection
Using as a Python Module
You can also use the NLP pipeline programmatically in your Python code:
from nlp_pipeline import NLPPipeline
# Initialize pipeline
pipeline = NLPPipeline()
# Analyze text
results = pipeline.analyze("Your text here")
# Access results
print(results['sentiment'])
print(results['entities'])
print(results['summary'])
Features Usage
Text Preprocessing
Automatic cleaning, tokenization, and normalization of input text
Sentiment Analysis
Get sentiment scores (positive, negative, neutral) with confidence levels
Entity Recognition
Automatically extracts people, organizations, locations, dates, and more
Text Summarization
Generate concise summaries of long texts using extractive or abstractive methods
Language Detection
Automatically detects text language with confidence scores
Emotion Detection
Identifies emotions like joy, sadness, anger, fear beyond basic sentiment
Readability Analysis
Multiple readability metrics (Flesch, SMOG, Coleman-Liau) and grade levels
Keyword Extraction
TF-IDF based keyword extraction with n-grams for important terms
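To illustrate what TF-IDF keyword extraction actually computes, here is a minimal pure-Python sketch. The project itself most likely uses scikit-learn's TfidfVectorizer; this toy version (unigrams only, smoothed IDF) just shows the underlying idea — terms that are frequent in one document but rare across documents score highest:

```python
import math
import re
from collections import Counter

def tfidf_keywords(docs, top_n=3):
    """Rank each document's terms by TF-IDF (term frequency x inverse document frequency)."""
    tokenized = [re.findall(r"[a-z']+", d.lower()) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    df = Counter(term for doc in tokenized for term in set(doc))
    results = []
    for doc in tokenized:
        tf = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log((1 + n_docs) / (1 + df[term]))
            for term, count in tf.items()
        }
        results.append([t for t, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:top_n]])
    return results

docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks rallied as the market closed higher",
]
print(tfidf_keywords(docs))
```

Note how "the", which appears in every document, gets an IDF of zero and never ranks as a keyword.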
Data Visualizations & Charts
The NLP Text Analysis Bot includes interactive charts and graphs to visualize analysis results. Here are examples of the visualizations you'll see:
Sentiment Analysis Chart
This pie chart shows the distribution of sentiment in analyzed text (Positive, Negative, Neutral).
Entity Type Distribution
Bar chart displaying different types of named entities found in the text (Person, Organization, Location, etc.).
Emotion Distribution
Visual representation of detected emotions (joy, sadness, anger, fear, surprise, etc.).
Readability Metrics Comparison
Comparison of different readability scores (Flesch, SMOG, Coleman-Liau, etc.).
Code Examples
Basic Python Usage
from nlp_pipeline import NLPPipeline
# Create pipeline instance
pipeline = NLPPipeline()
# Analyze text
text = "I love this product! It's amazing and works perfectly."
results = pipeline.analyze(text)
# Access results
print(results['sentiment'])
print(results['entities'])
print(results['summary'])
Advanced Features Usage
from nlp_pipeline import NLPPipeline
pipeline = NLPPipeline()
# Complete text analysis
text = "The new AI technology from OpenAI is revolutionizing how we work."
results = pipeline.analyze(text)
# Sentiment Analysis
print(f"Sentiment: {results['sentiment']['label']}")
print(f"Confidence: {results['sentiment']['confidence']}")
# Entity Recognition
for entity in results['entities']:
    print(f"{entity['text']} - {entity['label']}")
# Text Summarization
print(f"Summary: {results['summary']['text']}")
# Language Detection
print(f"Language: {results['language']['language']}")
print(f"Confidence: {results['language']['confidence']}")
# Emotion Detection
print(f"Primary Emotion: {results['emotion']['primary']}")
# Readability Analysis
print(f"Flesch Score: {results['readability']['flesch']}")
print(f"Grade Level: {results['readability']['grade_level']}")
Flask Web Interface
# Run Flask web interface
from app import app
if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0', port=5000)
# Or simply run:
# python app.py
Individual Module Usage
# Sentiment Analysis
from sentiment_analysis import SentimentAnalyzer
analyzer = SentimentAnalyzer()
sentiment = analyzer.analyze("I'm feeling great today!")
print(sentiment) # {'label': 'positive', 'score': 0.95}
# Entity Recognition
from entity_recognition import EntityRecognizer
recognizer = EntityRecognizer()
entities = recognizer.extract("Apple Inc. is located in Cupertino, California")
print(entities) # [{'text': 'Apple Inc.', 'label': 'ORG'}, ...]
# Text Summarization
from text_summarization import TextSummarizer
summarizer = TextSummarizer()
summary = summarizer.summarize("Long text here...", max_length=100)
print(summary)
# Language Detection
from language_detection import LanguageDetector
detector = LanguageDetector()
lang = detector.detect("Bonjour, comment allez-vous?")
print(lang) # {'language': 'fr', 'confidence': 0.99}
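The text similarity module's exact interface isn't shown above, so as a hedged sketch of the two methods the feature list mentions (cosine over bag-of-words counts, Jaccard over token sets), here is the underlying math in plain Python:

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def cosine_similarity(a, b):
    """Cosine similarity over bag-of-words term counts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def jaccard_similarity(a, b):
    """Jaccard index over the sets of unique tokens."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

t1, t2 = "I love Python programming", "Python is my favorite language"
print(round(cosine_similarity(t1, t2), 3))
print(round(jaccard_similarity(t1, t2), 3))
```

Cosine weighs repeated terms; Jaccard only cares whether a token occurs at all, so the two scores usually differ for the same pair of texts.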
Configuration
# config.py
ENABLE_SENTIMENT_ANALYSIS = True
ENABLE_ENTITY_RECOGNITION = True
ENABLE_TEXT_SUMMARIZATION = True
ENABLE_EMOTION_DETECTION = True
ENABLE_READABILITY_ANALYSIS = True
# Access in code
from config import ENABLE_SENTIMENT_ANALYSIS
print(ENABLE_SENTIMENT_ANALYSIS)
API Endpoints
Flask Web API
The application provides REST API endpoints for text analysis through the Flask web framework.
Available API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /api/analyze | POST | Complete text analysis with all features |
| /api/similarity | POST | Calculate similarity between two texts |
| /api/health | GET | Health check endpoint |
| / | GET | Web interface homepage |
API Usage Examples
# POST /api/analyze
import requests
url = "http://localhost:5000/api/analyze"
data = {
    "text": "I love this product! It's amazing."
}
response = requests.post(url, json=data)
results = response.json()
print(results['sentiment'])
print(results['entities'])
print(results['summary'])
# POST /api/similarity
import requests
url = "http://localhost:5000/api/similarity"
data = {
    "text1": "I love Python programming",
    "text2": "Python is my favorite language"
}
response = requests.post(url, json=data)
similarity = response.json()
print(f"Similarity: {similarity['similarity']}")
Response Format
{
  "original_text": "...",
  "preprocessing": {...},
  "sentiment": {
    "label": "positive",
    "score": 0.95,
    "confidence": 0.92
  },
  "entities": [...],
  "semantic": {...},
  "summary": {...},
  "language": {...},
  "emotion": {...},
  "readability": {...}
}
Configuration
Configuration in this Python application is handled through:
Configuration File
Edit config.py file in the root directory:
# config.py
DEFAULT_LANGUAGE = 'en'
ENABLE_ANALYTICS = True
ENABLE_SENTIMENT_ANALYSIS = True
ENABLE_API_INTEGRATIONS = True
# Optional API keys
WEATHER_API_KEY = None
NEWS_API_KEY = None
Note: API keys are optional. The bot works without them but with limited functionality.
Environment Variables
You can also use environment variables:
export DEFAULT_LANGUAGE=en
export ENABLE_ANALYTICS=true
export WEATHER_API_KEY=your_key_here
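A sketch of how such variables might be read in config.py (the actual file may differ): os.environ lookups with defaults, coercing the boolean flag from its string form:

```python
import os

# Read configuration from environment variables, falling back to defaults.
DEFAULT_LANGUAGE = os.environ.get("DEFAULT_LANGUAGE", "en")
# Environment variables are strings, so coerce the flag explicitly.
ENABLE_ANALYTICS = os.environ.get("ENABLE_ANALYTICS", "true").lower() in ("1", "true", "yes")
WEATHER_API_KEY = os.environ.get("WEATHER_API_KEY")  # None if unset; API keys are optional

print(DEFAULT_LANGUAGE, ENABLE_ANALYTICS, WEATHER_API_KEY)
```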
Runtime Configuration
Configure settings programmatically:
- Language: Use bot.set_language('es') to change language
- Analytics: Enable/disable through config.py
- API Integrations: Configure API keys in config.py
- Response Templates: Customize in response_templates.py
Configuration changes require restarting the application.
Web Interface Configuration
The Flask web interface can be configured in app.py. Default port is 5000, but you can change it in the run configuration.
Project Structure
Note: Edit config.py to customize settings and optional API keys.
Detailed File Descriptions
nlp_pipeline.py
Purpose: Main NLP pipeline orchestrator. Coordinates all NLP modules for comprehensive text analysis.
Key Features:
- Main pipeline class
- Orchestrates all NLP modules
- End-to-end text analysis
- Error handling and fallbacks
- Result aggregation
- Performance optimization
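The orchestration pattern described above can be sketched as follows. This is a simplified illustration, not the project's actual nlp_pipeline.py — the stub analyzers and result keys are assumptions. The point is the error-handling shape: each module runs independently, and a failure in one falls back to a default instead of aborting the whole analysis:

```python
class NLPPipeline:
    """Minimal orchestrator sketch: run each analyzer, aggregate results, fall back on errors."""

    def __init__(self, analyzers=None):
        # Each analyzer is a callable text -> result; these stubs stand in for the real modules.
        self.analyzers = analyzers or {
            "sentiment": lambda text: {"label": "positive" if "love" in text.lower() else "neutral"},
            "word_count": lambda text: len(text.split()),
        }

    def analyze(self, text):
        results = {"original_text": text}
        for name, analyzer in self.analyzers.items():
            try:
                results[name] = analyzer(text)
            except Exception as exc:
                # Graceful fallback: record the error, keep the rest of the analysis.
                results[name] = {"error": str(exc)}
        return results

pipeline = NLPPipeline()
print(pipeline.analyze("I love this product!"))
```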
text_preprocessing.py
Purpose: Text preprocessing module. Cleans, tokenizes, and normalizes text for analysis.
Key Features:
- Text cleaning
- Tokenization
- Stopword removal
- Lemmatization
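The first three steps above (cleaning, tokenization, stopword removal) can be sketched in plain Python; lemmatization is omitted here since it needs NLTK's WordNetLemmatizer, and the stopword set is a tiny illustrative subset, not the one the module actually uses:

```python
import re

# Tiny illustrative stopword set (the real module uses NLTK's list).
STOPWORDS = {"the", "a", "an", "is", "are", "and", "or", "to", "of", "in", "on", "it"}

def preprocess(text):
    text = re.sub(r"[^a-z\s]", " ", text.lower())       # cleaning: lowercase, strip punctuation
    tokens = text.split()                                # tokenization
    return [t for t in tokens if t not in STOPWORDS]     # stopword removal

print(preprocess("The cat is on the mat!"))
```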
sentiment_analysis.py
Purpose: Sentiment analysis module. Multi-method sentiment detection using VADER and transformers.
Key Features:
- VADER sentiment analysis
- Transformer-based analysis
- Sentiment classification
- Confidence scoring
entity_recognition.py
Purpose: Named entity recognition module. Extracts entities using spaCy.
Key Features:
- Named entity extraction
- Entity type classification
- Location, person, organization detection
- Date and time extraction
semantic_understanding.py
Purpose: Semantic analysis module. Keyword extraction and topic identification.
Key Features:
- Keyword extraction
- Topic identification
- Phrase analysis
- Semantic relationships
text_summarization.py
Purpose: Text summarization module. Generates concise summaries using extractive and abstractive methods.
Key Features:
- Extractive summarization
- Abstractive summarization
- Summary length control
- Transformer-based models
language_detection.py
Purpose: Language detection module. Automatically detects text language with confidence scores.
Key Features:
- Automatic language detection
- Confidence scoring
- Multi-language support
- Language probability distribution
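The module itself most likely wraps a dedicated library (e.g. langdetect); purely to illustrate the idea, here is a toy detector that scores languages by stopword overlap — every name in it is an assumption, not the module's API:

```python
import re

# Toy per-language stopword sets; real detectors use character n-gram models.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "it"},
    "fr": {"le", "la", "et", "est", "vous", "je", "de"},
    "es": {"el", "los", "y", "es", "de", "que"},
}

def detect_language(text):
    tokens = re.findall(r"[a-zà-ÿ']+", text.lower())
    # Score each language by how many tokens appear in its stopword set.
    scores = {lang: sum(t in words for t in tokens) for lang, words in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(detect_language("Bonjour, comment allez-vous?"))
```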
text_classification.py
Purpose: Text classification module. Zero-shot classification into multiple categories.
Key Features:
- Zero-shot classification
- Category assignment
- Confidence scores
- Custom categories
emotion_detection.py
Purpose: Emotion detection module. Advanced emotion analysis beyond basic sentiment.
Key Features:
- Emotion classification
- Emotion distribution
- Primary emotion detection
- Emotion intensity scoring
readability_analysis.py
Purpose: Readability analysis module. Multiple readability metrics and grade levels.
Key Features:
- Flesch reading ease
- SMOG index
- Coleman-Liau index
- Grade level calculation
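The Flesch reading ease formula itself is simple: 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words). Here is a rough sketch with a naive syllable counter (the module may well use a library such as textstat instead; the heuristic below is an approximation):

```python
import re

def count_syllables(word):
    """Very rough heuristic: count vowel groups, with a silent-'e' adjustment."""
    word = word.lower()
    groups = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and groups > 1:
        groups -= 1
    return max(groups, 1)

def flesch_reading_ease(text):
    """Flesch reading ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))

text = "The cat sat on the mat. It was a sunny day."
print(round(flesch_reading_ease(text), 1))
```

Higher scores mean easier text; short sentences of short words score well above 90, while dense academic prose can go negative.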
config.py
Purpose: Configuration module. Contains settings, constants, and configuration options.
Key Features:
- Application settings
- Default configurations
- API key management
- Feature toggles
app.py
Purpose: Flask web interface application. Provides the web-based interface for the NLP Text Analysis Bot.
Key Features:
- Flask web server
- REST API endpoints
- Web interface routes
- Template rendering
requirements.txt
Purpose: Python dependencies list. Contains all required packages and versions.
Key Features:
- Dependency management
- Version specifications
- Package listings
- Installation instructions
README.md
Purpose: Project overview and quick start guide. Provides introduction, features, installation instructions, and usage examples.
Contents:
- Project description
- Features list
- Installation guide
- Usage instructions
- Project structure
- Support information
RELEASE_NOTES.md
Purpose: Release notes. Documents features, changes, and updates in the current version.
Contents:
- Release information
- Feature list
- Technical details
- Changelog
LICENSE
Purpose: MIT License file. Contains the full MIT License text and copyright information.
License Type: MIT License
Copyright: © 2026 RSK World
.gitignore
Purpose: Git ignore rules. Specifies files and directories that should not be tracked by version control.
Excluded Items:
- .env files (API keys)
- node_modules/ directory
- build/ directory
- IDE configuration files
- OS-specific files
- Log files
File Organization
Core NLP Modules: nlp_pipeline.py, text_preprocessing.py, sentiment_analysis.py, entity_recognition.py, semantic_understanding.py
Advanced Features: text_summarization.py, language_detection.py, emotion_detection.py, readability_analysis.py, text_classification.py, pos_tagging.py, text_similarity.py, advanced_keywords.py
Documentation: README.md, QUICKSTART.md, ADVANCED_FEATURES.md, LICENSE
Configuration: config.py, requirements.txt, .gitignore
Web Interface: app.py, templates/index.html, static/css/style.css, static/js/main.js
Advanced Features Details
1. Text Summarization
Extractive and abstractive summarization condenses long texts into concise summaries using transformer models, with control over summary length.
2. Language Detection
Automatic language detection identifies the language of input text and reports confidence scores, enabling multi-language text analysis.
3. Sentiment Analysis
The sentiment analysis module combines VADER with transformer models to classify text as positive, negative, or neutral, with confidence scores for each prediction.
4. Emotion Detection
Advanced emotion analysis goes beyond basic sentiment to identify emotions such as joy, sadness, anger, and fear, including emotion intensity and distribution.
5. Readability Analysis
Multiple readability metrics (Flesch reading ease, SMOG, Coleman-Liau) estimate how difficult a text is to read and map it to a grade level.
6. Text Classification & Keyword Extraction
Zero-shot classification assigns texts to custom categories with confidence scores, while TF-IDF based keyword extraction with n-grams surfaces the most important terms.
7. Flask Web Interface
Beautiful Flask-based web interface with responsive design, interactive visualizations, and REST API endpoints for programmatic access.
8. Modular & Extensible
Well-organized modular design makes it easy to extend with new features, integrations, and customizations. Each module is independent and can be modified or extended easily.
Troubleshooting
Installation Issues
- Make sure you're using Python 3.8 or higher: python --version
- Install all dependencies: pip install -r requirements.txt
- Use virtual environment: python -m venv venv then activate it
- If pip install fails, try: pip install --upgrade pip first
NLTK Data Issues
- Download required NLTK data: python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords')"
- If NLTK download fails, check internet connection
- NLTK data is downloaded automatically on first use
Import Errors
- Make sure you're in the project directory
- Activate virtual environment before running
- Check that all dependencies are installed: pip list
- If module not found, reinstall: pip install -r requirements.txt --force-reinstall
Common Issues
- Module not found errors: Run pip install -r requirements.txt to install all dependencies
- Port already in use: The default Flask port is 5000. Change it in app.py or set PORT environment variable
- Virtual environment issues: Make sure virtual environment is activated before running
- API integration errors: API keys are optional. The bot works without them but with limited functionality
Requirements
numpy>=1.21.0
scikit-learn>=1.0.0
nltk>=3.6
spacy>=3.4.0
python-dateutil>=2.8.2
colorama>=0.4.4
setuptools>=65.0.0
flask>=2.3.0
requests>=2.28.0
See requirements.txt for the complete list of dependencies.
Python Version: Python 3.8 or higher required.
Use Cases
Product Reviews
Analyze customer feedback sentiment
News Articles
Extract entities and summarize content
Social Media
Detect emotions and sentiment in posts
Technical Documents
Analyze readability and extract keywords
Education
Learning platform for NLP concepts
Multi-Language
Automatic language detection for international text
Support
For support, questions, or more projects:
- Website: https://rskworld.in
- Email: help@rskworld.in
- Phone: +91 93305 39277
License
This project is provided as-is for educational and development purposes.
MIT License - See LICENSE file for details.
Demo Folder Structure
The demo/ folder contains demonstration and documentation files for this project.
demo/
├── index.html   # This documentation page
├── demo.html    # Interactive demo page
├── style.css    # Stylesheet (optional, styles are inline)
└── script.js    # JavaScript (optional, can be added for interactivity)
Demo Files Description
index.html
Purpose: Comprehensive project documentation and information page. This HTML file contains all details about the NLP Text Analysis Bot project.
Contents:
- Complete project overview
- All features documentation
- Installation instructions
- Usage examples
- Code examples
- API integration reference
- Configuration details
- Detailed file and folder descriptions
- Project structure
- Troubleshooting guide
- Support information
Features:
- Self-contained HTML with inline CSS
- Responsive design
- Modern, beautiful UI
- Well-organized sections
- Easy navigation
style.css
Purpose: External stylesheet file (optional). Currently, all styles are embedded inline in index.html, but this file can be used for additional custom styles if needed.
Status: Empty file - can be used for custom styling
Usage: Add custom CSS styles here if you want to override or extend the inline styles in index.html
script.js
Purpose: External JavaScript file (optional). Can be used to add interactive features to the documentation page.
Status: Empty file - can be used for additional functionality
Potential Uses:
- Table of contents navigation
- Smooth scrolling
- Search functionality
- Copy code snippets
- Theme toggle
- Print functionality
- Interactive elements
demo.html
Purpose: Interactive demo page showcasing the NLP Text Analysis Bot features in action.
Features:
- Live text analysis interface
- Sentiment analysis and emotion display
- Entity recognition and keyword extraction
- Real-time statistics and analytics
- Quick action buttons
- Beautiful, responsive UI
Usage: Open demo.html in your browser to try the interactive demo!
About the Demo Folder
The demo/ folder is separate from the main nlp-text-analysis-bot/ project folder. It contains:
- Documentation: This comprehensive HTML documentation page that explains the entire project
- Interactive Demo: demo.html - Live interactive demo showcasing the text analysis features
- Styling: Optional CSS file for custom styling
- Scripts: Optional JavaScript file for enhanced interactivity
Note: To view this documentation, simply open demo/index.html in any web browser. To try the interactive demo, open demo.html. Both pages are self-contained and work offline.