
REST API Guide

Serve all malaysian-manglish-nlp modules over HTTP with FastAPI, including batch processing, rate limiting, and Docker deployment.


Why REST API?

A REST API makes the NLP modules available to microservices, frontends, and mobile app backends, and gives the whole team NLP access without installing Python dependencies on every machine. Deploy once, call from anywhere.


Installation

pip install "malaysian-manglish-nlp[api]"

This installs FastAPI and uvicorn.


Start the server

Option 1: Python command

python -m malaysian_manglish_nlp.rest_api

Option 2: Uvicorn directly

uvicorn malaysian_manglish_nlp.rest_api:app --host 0.0.0.0 --port 8000

Option 3: With auto-reload (development)

uvicorn malaysian_manglish_nlp.rest_api:app --host 0.0.0.0 --port 8000 --reload

Server starts at http://localhost:8000.

Interactive docs

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

Endpoints

System

GET /health

Health check.

curl http://localhost:8000/health
{
  "status": "healthy",
  "version": "2.0.0",
  "modules_loaded": 51
}
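When the server is started from a script or a container, it can help to wait for /health before sending real traffic. The following is a minimal sketch using only the standard library; the helper names `fetch_health` and `wait_until_healthy`, the attempt count, and the delay are assumptions for illustration, not part of the package.

```python
import json
import time
import urllib.request


def fetch_health(base_url="http://localhost:8000", timeout=2):
    """GET /health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_healthy(fetch=fetch_health, attempts=10, delay=0.5):
    """Poll until the status field is "healthy"; return the body, or None."""
    for _ in range(attempts):
        try:
            body = fetch()
            if body.get("status") == "healthy":
                return body
        except OSError:
            # Connection refused or timed out: the server may still be starting.
            pass
        time.sleep(delay)
    return None
```

Passing a different `fetch` callable makes the polling loop usable in tests without a running server.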

GET /modules

List available NLP modules.

curl http://localhost:8000/modules
[
  {"name": "sentiment", "description": "Sentiment analysis", "endpoint": "/sentiment"},
  {"name": "normalize", "description": "Expand Manglish shortforms", "endpoint": "/normalize"},
  ...
]

NLP Endpoints

All NLP endpoints accept a POST request with a JSON body:

{
  "text": "your text here",
  "options": {}
}
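The same request body can be built in Python without third-party packages. This is a sketch, assuming the default server address from "Start the server"; `build_request` and `post_json` are hypothetical helper names, not functions shipped with the library.

```python
import json
import urllib.request

API = "http://localhost:8000"  # default address; adjust for your deployment


def build_request(endpoint, text, options=None, base_url=API):
    """Build a POST request carrying the common {"text", "options"} body."""
    body = {"text": text, "options": options or {}}
    return urllib.request.Request(
        url=f"{base_url}{endpoint}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(endpoint, text, options=None, timeout=10):
    """Send the request and decode the JSON response (needs a running server)."""
    req = build_request(endpoint, text, options)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `post_json("/sentiment", "Best gila makanan kat sini!")` would send the same request as the curl examples in this section.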

POST /sentiment

Sentiment analysis.

curl -X POST http://localhost:8000/sentiment \
  -H "Content-Type: application/json" \
  -d '{"text": "Best gila makanan kat sini!"}'
{
  "result": {"label": "positive", "score": 0.94},
  "processing_time_ms": 0.42
}

POST /normalize

Expand Manglish shortforms.

curl -X POST http://localhost:8000/normalize \
  -H "Content-Type: application/json" \
  -d '{"text": "nk tnya brp hrga"}'
{
  "result": "nak tanya berapa harga",
  "processing_time_ms": 0.18
}

POST /translate

Translate text.

curl -X POST http://localhost:8000/translate \
  -H "Content-Type: application/json" \
  -d '{"text": "Saya nak makan nasi lemak", "target": "en"}'
{
  "result": "I want to eat nasi lemak",
  "processing_time_ms": 0.95
}

Target options: en, bm, ms, formal

POST /ner

Named Entity Recognition.

curl -X POST http://localhost:8000/ner \
  -H "Content-Type: application/json" \
  -d '{"text": "Ahmad kerja kat Petronas KL"}'
{
  "result": [["Ahmad", "PERSON"], ["Petronas", "ORG"], ["KL", "LOCATION"]],
  "processing_time_ms": 0.87
}
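The result is a list of [text, label] pairs, so downstream code often wants it grouped by label. A small sketch, using the response shown above as input; `group_entities` is a hypothetical helper name.

```python
from collections import defaultdict


def group_entities(response):
    """Map each entity label to the entity strings carrying it, in order."""
    grouped = defaultdict(list)
    for text, label in response["result"]:
        grouped[label].append(text)
    return dict(grouped)


sample = {
    "result": [["Ahmad", "PERSON"], ["Petronas", "ORG"], ["KL", "LOCATION"]],
    "processing_time_ms": 0.87,
}
print(group_entities(sample))
# {'PERSON': ['Ahmad'], 'ORG': ['Petronas'], 'LOCATION': ['KL']}
```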

POST /pos

Part-of-Speech tagging.

curl -X POST http://localhost:8000/pos \
  -H "Content-Type: application/json" \
  -d '{"text": "Saya suka makan nasi lemak"}'
{
  "result": [["Saya", "PRON"], ["suka", "VERB"], ["makan", "VERB"], ...],
  "processing_time_ms": 0.65
}

POST /summarize

Text summarization.

curl -X POST http://localhost:8000/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Long article text here..."}'

POST /emotion

Emotion detection.

curl -X POST http://localhost:8000/emotion \
  -H "Content-Type: application/json" \
  -d '{"text": "Geram betul dengan service dia!"}'
{
  "result": {"primary": "anger", "score": 0.88, "secondary": "disgust"},
  "processing_time_ms": 0.38
}

POST /keywords

Keyword extraction.

curl -X POST http://localhost:8000/keywords \
  -H "Content-Type: application/json" \
  -d '{"text": "Harga minyak sawit meningkat ke paras tertinggi"}'

POST /language

Language detection.

curl -X POST http://localhost:8000/language \
  -H "Content-Type: application/json" \
  -d '{"text": "Weh jom la makan, I lapar gila"}'
{
  "result": {"primary": "manglish", "scores": {"ms": 0.45, "en": 0.55}},
  "processing_time_ms": 0.22
}
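One way to consume this response is to pick the highest-scoring language and flag code-mixed text. The sketch below assumes that "scores" holds one score per language code, as in the example above; the helper names and the 0.3 threshold are arbitrary choices for illustration.

```python
def dominant_language(response):
    """Return the language code with the highest score."""
    scores = response["result"]["scores"]
    return max(scores, key=scores.get)


def is_code_mixed(response, threshold=0.3):
    """True when at least two languages score at or above the threshold."""
    scores = response["result"]["scores"]
    return sum(1 for s in scores.values() if s >= threshold) >= 2


sample = {
    "result": {"primary": "manglish", "scores": {"ms": 0.45, "en": 0.55}},
    "processing_time_ms": 0.22,
}
print(dominant_language(sample))  # en
print(is_code_mixed(sample))      # True
```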

POST /formalize

Convert informal to formal BM.

curl -X POST http://localhost:8000/formalize \
  -H "Content-Type: application/json" \
  -d '{"text": "aku nk g mkn jap"}'

POST /dialect

Dialect detection.

curl -X POST http://localhost:8000/dialect \
  -H "Content-Type: application/json" \
  -d '{"text": "Ambo nok make nasi kerabu"}'

POST /analyze

Full analysis pipeline (all modules).

curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{"text": "Weh best gila kedai tu!"}'

Returns normalized text, sentiment, language, POS, entities, emotion, and keywords in one response.


Batch processing

POST /batch

Process multiple texts with multiple modules at once.

curl -X POST http://localhost:8000/batch \
  -H "Content-Type: application/json" \
  -d '{
    "texts": [
      "Best gila movie tu!",
      "Teruk la service dia",
      "Sedap nasi lemak Mak Cik"
    ],
    "modules": ["sentiment", "normalize", "ner"]
  }'
{
  "results": [
    {
      "text": "Best gila movie tu!",
      "sentiment": {"label": "positive", "score": 0.93},
      "normalize": "Best gila movie tu!",
      "ner": []
    },
    ...
  ],
  "processing_time_ms": 2.34,
  "count": 3
}

Available batch modules: sentiment, normalize, ner, pos, translate, emotion, keywords, language, formalize, summarize

Batch limits

Each batch request accepts at most 50 texts, and each text may be at most 10,000 characters long.
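Clients with larger workloads need to split their input to stay within these limits. A minimal sketch of client-side chunking follows; the function names are hypothetical, and the two constants mirror the limits stated above.

```python
MAX_TEXTS_PER_BATCH = 50   # batch size limit stated above
MAX_TEXT_LENGTH = 10_000   # per-text character limit stated above


def chunk_texts(texts, size=MAX_TEXTS_PER_BATCH):
    """Split texts into lists of at most `size`, rejecting over-long texts."""
    for i, text in enumerate(texts):
        if len(text) > MAX_TEXT_LENGTH:
            raise ValueError(
                f"text {i} has {len(text)} characters; limit is {MAX_TEXT_LENGTH}"
            )
    return [texts[i:i + size] for i in range(0, len(texts), size)]


def batch_payloads(texts, modules):
    """Build one /batch request body per chunk."""
    return [{"texts": chunk, "modules": list(modules)} for chunk in chunk_texts(texts)]
```

For 120 texts this yields three request bodies holding 50, 50, and 20 texts, each of which can be posted to /batch separately.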


Rate limiting

Built-in rate limiting: 100 requests per minute per IP.

Requests over the limit return HTTP 429:

{"error": "Rate limit exceeded. Try again later."}
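A client can handle 429 responses by waiting and retrying. The sketch below uses exponential backoff, which is an assumption on our part: this guide does not say whether the server sends a Retry-After header. `call_with_retry` is a hypothetical helper, and `send` stands for any zero-argument callable that performs the request and returns (status_code, body).

```python
import time


def call_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    Returns the last (status_code, body) pair received.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * (2 ** attempt))
    return status, body
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without actually waiting.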

Custom rate limits

Change the limit in code or via an environment variable. In code:

# In rest_api.py
rate_limiter = RateLimiter(max_requests=200, window_seconds=60)  # 200/min
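To make the two parameters concrete, here is a sketch of how a per-client limiter with `max_requests` and `window_seconds` could work. It is not the library's `RateLimiter` implementation, which this guide does not show; the class name `SlidingWindowLimiter` and the sliding-window strategy are assumptions for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests=100, window_seconds=60, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # client id (e.g. IP) -> request timestamps

    def allow(self, client_id):
        """Record the request and return True, or return False if over the limit."""
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # a server would answer this request with HTTP 429
        hits.append(now)
        return True
```

State kept this way lives in one process, so with several workers each worker would count requests separately unless the counters are moved to shared storage.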

Docker deployment

Using the included Dockerfile

# Build
docker build -t malaysian-manglish-nlp-api .

# Run
docker run -p 8000:8000 malaysian-manglish-nlp-api

Using docker-compose

docker-compose up -d

Dockerfile

FROM python:3.11-slim

WORKDIR /app
COPY . .
RUN pip install --no-cache-dir malaysian-manglish-nlp[api]

EXPOSE 8000
CMD ["uvicorn", "malaysian_manglish_nlp.rest_api:app", "--host", "0.0.0.0", "--port", "8000"]

docker-compose.yml

version: '3.8'
services:
  api:
    image: malaysian-manglish-nlp-api
    ports:
      - "8000:8000"
    environment:
      - UVICORN_WORKERS=4
    restart: unless-stopped

Python client usage

import requests

API = "http://localhost:8000"

# Sentiment: single text
resp = requests.post(f"{API}/sentiment", json={"text": "Best gila!"}, timeout=10)
resp.raise_for_status()  # raise on 4xx/5xx, e.g. 429 when rate limited
print(resp.json())
# {'result': {'label': 'positive', 'score': 0.94}, 'processing_time_ms': 0.42}

# Batch: several texts, several modules
resp = requests.post(f"{API}/batch", json={
    "texts": ["Best!", "Teruk la"],
    "modules": ["sentiment", "normalize"]
}, timeout=10)
resp.raise_for_status()
for r in resp.json()["results"]:
    print(r)

JavaScript/TypeScript client

const API = 'http://localhost:8000';

// Sentiment
const resp = await fetch(`${API}/sentiment`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: 'Best gila makanan!' })
});
const data = await resp.json();
console.log(data.result);
// { label: 'positive', score: 0.94 }

// Batch
const batch = await fetch(`${API}/batch`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
        texts: ['Best!', 'Teruk la'],
        modules: ['sentiment', 'normalize']
    })
});
const batchData = await batch.json();
console.log(batchData.results);

CLI usage

# Start server
uvicorn malaysian_manglish_nlp.rest_api:app --host 0.0.0.0 --port 8000

# Or use python
python -m malaysian_manglish_nlp.rest_api

# Docker
docker run -p 8000:8000 zafranyusof/malaysian-manglish-nlp:latest

Performance

Endpoint             Avg latency    Throughput
/sentiment           < 1 ms         15,000 req/sec
/normalize           < 0.5 ms       30,000 req/sec
/translate           < 2 ms         8,000 req/sec
/ner                 < 1 ms         12,000 req/sec
/analyze             < 5 ms         4,000 req/sec
/batch (10 texts)    < 10 ms        1,500 req/sec
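Latency and throughput depend on hardware and worker count, so it is worth measuring on your own deployment. The sketch below times any zero-argument callable; `measure_latency` is a hypothetical helper, and because it issues calls one after another it reports sequential throughput only, not what a concurrent load test would show.

```python
import statistics
import time


def measure_latency(call, n=100, clock=time.perf_counter):
    """Run call() n times sequentially; return latency stats in milliseconds."""
    samples = []
    for _ in range(n):
        start = clock()
        call()
        samples.append((clock() - start) * 1000.0)
    samples.sort()
    total_s = sum(samples) / 1000.0
    return {
        "avg_ms": statistics.fmean(samples),
        "p95_ms": samples[min(n - 1, int(0.95 * n))],
        "sequential_req_per_sec": n / total_s if total_s > 0 else float("inf"),
    }
```

In practice `call` would wrap a single HTTP request to the endpoint under test, for example a POST to /sentiment with a fixed text.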

Production deployment

For production, run multiple uvicorn workers behind a reverse proxy such as nginx or Traefik:

uvicorn malaysian_manglish_nlp.rest_api:app --host 0.0.0.0 --port 8000 --workers 4


See also