REST API Guide

Serve all malaysian-manglish-nlp modules over HTTP with FastAPI, including batch processing, rate limiting, and Docker deployment.

Why REST API?

A REST API supports microservice integration, frontend consumption, and mobile app backends, and it gives the whole team NLP access without installing Python dependencies on every machine. Deploy once, call from anywhere.

Installation

pip install "malaysian-manglish-nlp[api]"

This installs FastAPI and uvicorn.

Start the server

Option 1: Python command

python -m malaysian_manglish_nlp.rest_api

Option 2: Uvicorn directly

uvicorn malaysian_manglish_nlp.rest_api:app --host 0.0.0.0 --port 8000

Option 3: With auto-reload (development)

uvicorn malaysian_manglish_nlp.rest_api:app --host 0.0.0.0 --port 8000 --reload

Server starts at http://localhost:8000.

Interactive docs

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

Endpoints

System

`GET /health`

Health check.

curl http://localhost:8000/health

{
  "status": "healthy",
  "version": "2.0.0",
  "modules_loaded": 51
}
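A health endpoint like this is commonly polled before sending traffic, for example right after a container starts. The sketch below is one way to do that with the standard library; `fetch_health` and `wait_until_healthy` are hypothetical helper names, and the base URL assumes the default local server.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_health(base_url="http://localhost:8000", timeout=2.0):
    """Return the parsed /health body, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        return None


def wait_until_healthy(probe=fetch_health, attempts=30, delay=1.0, sleep=time.sleep):
    """Poll until the probe reports status "healthy"; return True on success."""
    for attempt in range(attempts):
        body = probe()
        if isinstance(body, dict) and body.get("status") == "healthy":
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next probe
    return False
```

Calling `wait_until_healthy()` with the defaults probes once per second for up to 30 attempts; the `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.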

`GET /modules`

List available NLP modules.

curl http://localhost:8000/modules

[
  {"name": "sentiment", "description": "Sentiment analysis", "endpoint": "/sentiment"},
  {"name": "normalize", "description": "Expand Manglish shortforms", "endpoint": "/normalize"},
  ...
]

NLP Endpoints

All NLP endpoints accept POST with JSON body:

{
  "text": "your text here",
  "options": {}
}
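As a minimal sketch of that shared request shape, the helper below builds the POST request with only the Python standard library. `build_request` and `call` are hypothetical helper names, not part of the package, and `API` assumes the default local address used in this guide.

```python
import json
import urllib.request

API = "http://localhost:8000"  # assumed default address from this guide


def build_request(endpoint, text, options=None):
    """Build a POST request carrying the common {"text", "options"} body."""
    body = {"text": text, "options": options or {}}
    return urllib.request.Request(
        url=f"{API}/{endpoint.lstrip('/')}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(endpoint, text, options=None, timeout=10.0):
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_request(endpoint, text, options)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `call("/sentiment", "Best gila makanan kat sini!")` would send the same request as the curl example for that endpoint, plus an empty `options` object.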

`POST /sentiment`

Sentiment analysis.

curl -X POST http://localhost:8000/sentiment \
  -H "Content-Type: application/json" \
  -d '{"text": "Best gila makanan kat sini!"}'

{
  "result": {"label": "positive", "score": 0.94},
  "processing_time_ms": 0.42
}

`POST /normalize`

Expand Manglish shortforms.

curl -X POST http://localhost:8000/normalize \
  -H "Content-Type: application/json" \
  -d '{"text": "nk tnya brp hrga"}'

{
  "result": "nak tanya berapa harga",
  "processing_time_ms": 0.18
}

`POST /translate`

Translate text.

curl -X POST http://localhost:8000/translate \
  -H "Content-Type: application/json" \
  -d '{"text": "Saya nak makan nasi lemak", "target": "en"}'

{
  "result": "I want to eat nasi lemak",
  "processing_time_ms": 0.95
}

Target options: en, bm, ms, formal
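A client can check the target before sending the request. The sketch below mirrors the body from the curl example, with `target` at the top level; `translate_payload` and `VALID_TARGETS` are hypothetical names, and whether the server itself rejects other targets is not stated in this guide.

```python
VALID_TARGETS = ("en", "bm", "ms", "formal")  # target options listed in this guide


def translate_payload(text, target="en"):
    """Build a /translate body, rejecting targets the guide does not list."""
    if target not in VALID_TARGETS:
        raise ValueError(f"target must be one of {VALID_TARGETS}, got {target!r}")
    return {"text": text, "target": target}


print(translate_payload("Saya nak makan nasi lemak", target="en"))
```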

`POST /ner`

Named Entity Recognition.

curl -X POST http://localhost:8000/ner \
  -H "Content-Type: application/json" \
  -d '{"text": "Ahmad kerja kat Petronas KL"}'

{
  "result": [["Ahmad", "PERSON"], ["Petronas", "ORG"], ["KL", "LOCATION"]],
  "processing_time_ms": 0.87
}
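Because the result is a list of `[text, label]` pairs, it is often convenient to group entities by label on the client side. A small sketch, using the response above as sample data (`group_entities` is a hypothetical helper name):

```python
from collections import defaultdict


def group_entities(response):
    """Group the [text, label] pairs in a /ner response by label."""
    grouped = defaultdict(list)
    for text, label in response["result"]:
        grouped[label].append(text)
    return dict(grouped)


sample = {
    "result": [["Ahmad", "PERSON"], ["Petronas", "ORG"], ["KL", "LOCATION"]],
    "processing_time_ms": 0.87,
}
print(group_entities(sample))
# {'PERSON': ['Ahmad'], 'ORG': ['Petronas'], 'LOCATION': ['KL']}
```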

`POST /pos`

Part-of-Speech tagging.

curl -X POST http://localhost:8000/pos \
  -H "Content-Type: application/json" \
  -d '{"text": "Saya suka makan nasi lemak"}'

{
  "result": [["Saya", "PRON"], ["suka", "VERB"], ["makan", "VERB"], ...],
  "processing_time_ms": 0.65
}

`POST /summarize`

Text summarization.

curl -X POST http://localhost:8000/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Long article text here..."}'

`POST /emotion`

Emotion detection.

curl -X POST http://localhost:8000/emotion \
  -H "Content-Type: application/json" \
  -d '{"text": "Geram betul dengan service dia!"}'

{
  "result": {"primary": "anger", "score": 0.88, "secondary": "disgust"},
  "processing_time_ms": 0.38
}

`POST /keywords`

Keyword extraction.

curl -X POST http://localhost:8000/keywords \
  -H "Content-Type: application/json" \
  -d '{"text": "Harga minyak sawit meningkat ke paras tertinggi"}'

`POST /language`

Language detection.

curl -X POST http://localhost:8000/language \
  -H "Content-Type: application/json" \
  -d '{"text": "Weh jom la makan, I lapar gila"}'

{
  "result": {"primary": "manglish", "scores": {"ms": 0.45, "en": 0.55}},
  "processing_time_ms": 0.22
}

`POST /formalize`

Convert informal to formal BM.

curl -X POST http://localhost:8000/formalize \
  -H "Content-Type: application/json" \
  -d '{"text": "aku nk g mkn jap"}'

`POST /dialect`

Dialect detection.

curl -X POST http://localhost:8000/dialect \
  -H "Content-Type: application/json" \
  -d '{"text": "Ambo nok make nasi kerabu"}'

`POST /analyze`

Full analysis pipeline (all modules).

curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{"text": "Weh best gila kedai tu!"}'

Returns normalized text, sentiment, language, POS, entities, emotion, and keywords in one response.

Batch processing

`POST /batch`

Process multiple texts with multiple modules at once.

curl -X POST http://localhost:8000/batch \
  -H "Content-Type: application/json" \
  -d '{
    "texts": [
      "Best gila movie tu!",
      "Teruk la service dia",
      "Sedap nasi lemak Mak Cik"
    ],
    "modules": ["sentiment", "normalize", "ner"]
  }'

{
  "results": [
    {
      "text": "Best gila movie tu!",
      "sentiment": {"label": "positive", "score": 0.93},
      "normalize": "Best gila movie tu!",
      "ner": []
    },
    ...
  ],
  "processing_time_ms": 2.34,
  "count": 3
}

Available batch modules: sentiment, normalize, ner, pos, translate, emotion, keywords, language, formalize, summarize

Batch limits

Each batch request accepts at most 50 texts, and each text may be at most 10,000 characters long.
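To send more than 50 texts, a client has to split them across several requests. The sketch below shows one way to build `/batch` payloads that stay within both limits; `make_batches` and the two constants are hypothetical names chosen for illustration.

```python
MAX_BATCH_SIZE = 50       # texts per /batch request, per the limits above
MAX_TEXT_LENGTH = 10_000  # characters per text, per the limits above


def make_batches(texts, modules, batch_size=MAX_BATCH_SIZE):
    """Split texts into /batch payloads that respect the documented limits."""
    too_long = [i for i, text in enumerate(texts) if len(text) > MAX_TEXT_LENGTH]
    if too_long:
        raise ValueError(f"texts at indexes {too_long} exceed {MAX_TEXT_LENGTH} characters")
    return [
        {"texts": texts[start:start + batch_size], "modules": list(modules)}
        for start in range(0, len(texts), batch_size)
    ]
```

Each returned dictionary can be posted to `/batch` as the JSON body. Over-long texts raise an error here rather than being truncated, so the caller decides how to shorten them.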

Rate limiting

Built-in rate limiting: 100 requests per minute per IP.

Exceeded requests return HTTP 429:

{"error": "Rate limit exceeded. Try again later."}
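Clients can handle a 429 by waiting and retrying. The sketch below uses exponential backoff, which is a common client-side choice rather than something this API prescribes; `post_with_retry` is a hypothetical helper, and `send` is any zero-argument callable that returns `(status_code, body)`.

```python
import time


def post_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Double the wait after each rejected attempt: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** attempt)
    return status, body  # still rate limited after the final attempt
```

With the `requests` library, `send` could wrap a call such as `requests.post(f"{API}/sentiment", json={"text": "Best gila!"})` and return its `status_code` and `json()`.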

Custom rate limits

Change the limit in code or via an environment variable. In code:

# In rest_api.py
rate_limiter = RateLimiter(max_requests=200, window_seconds=60)  # 200/min
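To illustrate what `max_requests` and `window_seconds` mean, here is a sliding-window limiter keyed by client IP. It is a sketch of one common approach and not the package's `RateLimiter`, whose internals this guide does not show; `SlidingWindowLimiter` and `allow` are hypothetical names.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Illustrative per-client limiter; NOT the library's RateLimiter."""

    def __init__(self, max_requests=100, window_seconds=60, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id):
        """Return True if the client may proceed, False if it should get a 429."""
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Under this scheme, `max_requests=200, window_seconds=60` lets each client make up to 200 requests in any rolling 60-second span.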

Docker deployment

Using the included Dockerfile

# Build
docker build -t malaysian-manglish-nlp-api .

# Run
docker run -p 8000:8000 malaysian-manglish-nlp-api

Using docker-compose

docker-compose up -d

Dockerfile

FROM python:3.11-slim

WORKDIR /app
COPY . .
RUN pip install --no-cache-dir malaysian-manglish-nlp[api]

EXPOSE 8000
CMD ["uvicorn", "malaysian_manglish_nlp.rest_api:app", "--host", "0.0.0.0", "--port", "8000"]

docker-compose.yml

version: '3.8'
services:
  api:
    image: malaysian-manglish-nlp-api
    ports:
      - "8000:8000"
    environment:
      - UVICORN_WORKERS=4
    restart: unless-stopped

Python client usage

import requests

API = "http://localhost:8000"

# Sentiment
resp = requests.post(f"{API}/sentiment", json={"text": "Best gila!"})
print(resp.json())
# {'result': {'label': 'positive', 'score': 0.94}, 'processing_time_ms': 0.42}

# Batch
resp = requests.post(f"{API}/batch", json={
    "texts": ["Best!", "Teruk la"],
    "modules": ["sentiment", "normalize"]
})
for r in resp.json()['results']:
    print(r)

JavaScript/TypeScript client

const API = 'http://localhost:8000';

// Sentiment
const resp = await fetch(`${API}/sentiment`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: 'Best gila makanan!' })
});
const data = await resp.json();
console.log(data.result);
// { label: 'positive', score: 0.94 }

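// Batch
const batch = await fetch(`${API}/batch`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
        texts: ['Best!', 'Teruk la'],
        modules: ['sentiment', 'normalize']
    })
});
// Parse the response; each entry in results corresponds to one input text
const batchData = await batch.json();
console.log(batchData.results);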

CLI usage

# Start server
$ uvicorn malaysian_manglish_nlp.rest_api:app --host 0.0.0.0 --port 8000

# Or use python
$ python -m malaysian_manglish_nlp.rest_api

# Docker
$ docker run -p 8000:8000 zafranyusof/malaysian-manglish-nlp:latest

Performance

| Endpoint | Avg latency | Throughput |
|---|---|---|
| `/sentiment` | < 1 ms | 15,000 req/sec |
| `/normalize` | < 0.5 ms | 30,000 req/sec |
| `/translate` | < 2 ms | 8,000 req/sec |
| `/ner` | < 1 ms | 12,000 req/sec |
| `/analyze` | < 5 ms | 4,000 req/sec |
| `/batch` (10 texts) | < 10 ms | 1,500 req/sec |

Production deployment

For production, use multiple uvicorn workers behind a reverse proxy (nginx/traefik):

uvicorn malaysian_manglish_nlp.rest_api:app --host 0.0.0.0 --port 8000 --workers 4