PostgreSQL Full-Text Search: Good Enough to Replace Elasticsearch?

Technical PM + Software Engineer
Teams reach for Elasticsearch because they want relevance, speed, and features. But most apps don’t need a distributed search cluster: PostgreSQL’s native full-text search (FTS) — tsvector, tsquery, GIN indexes and ranking functions — handles roughly 90% of typical search use cases with no extra service, no additional ops burden, and no added cost. This article is practical and implementation-first: how to model, index, query, rank, and operate Postgres FTS so you can confidently decide whether to skip Elasticsearch.
Core concepts: tsvector and tsquery
Postgres FTS separates document representation (tsvector) from queries (tsquery). A tsvector is a preprocessed, normalized list of lexemes extracted from text. A tsquery is a boolean/search expression built from lexemes and operators. Matching is performed with the @@ operator: document_vector @@ query_expression.
Use to_tsvector to build vectors and plainto_tsquery, phraseto_tsquery, or websearch_to_tsquery to parse user input. Choose a text search configuration (language) to get stemming and stop-word behavior; e.g., 'english'. Example: to_tsvector('english', coalesce(title, '')) || to_tsvector('english', coalesce(body, '')) produces a combined vector.
Practical reminder: normalize the text you index (coalesce, null handling). For multilingual data, store per-language vectors or a normalized language column and switch the config at index-time or query-time.
- tsvector: indexed document representation (lexemes, positions optional)
- tsquery: parsed query expression (AND, OR, phrase operators)
- @@ operator: tests match between vector and query
Indexing patterns: GIN, expression indexes, and triggers
The typical pattern is an expression index on a tsvector column using GIN: CREATE INDEX CONCURRENTLY idx_documents_search ON documents USING gin(search_vector); GIN is fast for containment queries and supports indexing tsvector efficiently.
Store a persistent search_vector column on the table, and maintain it using either application logic or a DB trigger. Postgres provides a built-in trigger function tsvector_update_trigger which is convenient: CREATE TRIGGER documents_vector_update BEFORE INSERT OR UPDATE ON documents FOR EACH ROW EXECUTE FUNCTION tsvector_update_trigger(search_vector, 'english', title, body); This keeps the vector consistent with columns without recomputing on read.
Create expression indexes when you want more control or want to index a produced expression without altering the table schema: CREATE INDEX CONCURRENTLY idx_documents_expr ON documents USING gin((setweight(to_tsvector('english', coalesce(title, '')), 'A') || setweight(to_tsvector('english', coalesce(body, '')), 'B')));
- Prefer GIN for general FTS; GiST can be useful for ranking-by-distance combos
- Use CREATE INDEX CONCURRENTLY in production to avoid locks
- Persisted search_vector + trigger = simplest operational model
Ranking and relevance: ts_rank and best practices
Postgres includes ranking functions ts_rank and ts_rank_cd. They compute relevance scores based on term frequency, document length, and position weights. A canonical pattern: SELECT id, ts_rank_cd(search_vector, plainto_tsquery('english', :q)) AS rank FROM documents WHERE search_vector @@ plainto_tsquery('english', :q) ORDER BY rank DESC LIMIT 20;
Use setweight when creating vectors to prioritize fields. Typical weighting: title 'A', summary 'B', body 'C', tags 'D'. Build the vector with setweight and then index it so the ranking reflects field importance.
If you need phrase prioritization, use phraseto_tsquery or combine adjacency operators: to_tsquery('english', 'open <-> source') requires exact adjacency for a stronger match. For Google-like input, websearch_to_tsquery is handy since it supports quoted phrases, minus-terms, and boolean-like syntax.
Measure and tune: ranking functions are cheap. Test with real queries and labels; tweak weights and use ts_rank_cd with normalization options for consistent scoring across documents of different lengths.
- Use setweight to reflect field importance
- ts_rank_cd is generally preferable for consistent scores
- phraseto_tsquery and <-> handle phrase matching; websearch_to_tsquery for user-like syntax
Query patterns and code examples
Here are compact, copy-paste-friendly SQL patterns you can adopt. Basic match using plainto_tsquery: SELECT id, ts_rank_cd(search_vector, plainto_tsquery('english', :q)) AS rank FROM documents WHERE search_vector @@ plainto_tsquery('english', :q) ORDER BY rank DESC LIMIT 25;
Phrase search with phraseto_tsquery: SELECT id FROM documents WHERE search_vector @@ phraseto_tsquery('english', 'exact phrase'); For proximity you can use adjacency operators with to_tsquery: WHERE search_vector @@ to_tsquery('english', 'open <2> source') which allows two lexemes between terms.
Web-style user input: postgres websearch_to_tsquery handles quoted phrases and negative terms. Example: WHERE search_vector @@ websearch_to_tsquery('english', :user_input). For highlighting/snippets: SELECT ts_headline('english', body, websearch_to_tsquery('english', :user_input), 'StartSel=<b>, StopSel=</b>, MaxWords=30') AS snippet FROM documents WHERE search_vector @@ websearch_to_tsquery('english', :user_input) ORDER BY ts_rank_cd(search_vector, websearch_to_tsquery('english', :user_input)) DESC LIMIT 10;
- Basic: plainto_tsquery -> good for simple tokenized searches
- Phrase: phraseto_tsquery or adjacency (<->) -> exact phrase control
- User-friendly: websearch_to_tsquery -> Google-style input handling
- Highlighting: ts_headline -> in-DB snippet generation
Advanced techniques and operational considerations
Fuzzy matching and typo tolerance: Postgres alone can be clunky for fuzzy within FTS. Combine pg_trgm similarity indexes for LIKE/ILIKE and similarity searches. Example: CREATE INDEX idx_docs_trgm ON documents USING gin (title gin_trgm_ops); Use trigram searches for fallback when FTS fails to match misspellings.
Index bloat and maintenance: GIN indexes can bloat over heavy write workloads. Autovacuum helps but monitor pg_stat_user_indexes and bloat; consider periodic REINDEX CONCURRENTLY for critical indexes. Use fastupdate on GIN if needed; note the tradeoffs in maintenance costs.
Scaling: for many applications a single Postgres instance (or primary + read replicas) suffices. If you need horizontal search scaling, consider Citus or an external search cluster. However, that complexity is only necessary for the top 10% of extreme workloads.
Operational simplicity: keeping search inside Postgres reduces data synchronization problems, simplifies backups, and leverages existing monitoring, authentication, and HA setups. You avoid schema synchronization tasks and the mental overhead of an extra service.
- Use pg_trgm for typo-tolerance and similarity fallbacks
- Monitor GIN index bloat and use REINDEX CONCURRENTLY when necessary
- Scale vertically first; consider Citus or external search only when justified
- Single-system search reduces operational surface area and cost
When Postgres is not enough: know the limits
Postgres FTS is a pragmatic, lower-cost choice, but it has limits: Elasticsearch and other engines shine when you need distributed indexing, near-real-time replication across many shards, advanced text-analysis pipelines, large-scale multi-tenant isolation, or very advanced scoring plugins. If you must support extremely high indexing throughput across many nodes, or need features like per-field analyzers with dozens of token filters, a dedicated search engine may be appropriate.
Before introducing Elasticsearch, quantify the pain: measure query latency, indexing throughput, expected growth, and features required (fuzzy algorithms, complex synonyms, per-field analyzers). Often you can incrementally extend Postgres with pg_trgm, materialized search tables, or tuned vacuum and still meet requirements.
- Use Postgres FTS for the majority of cases
- Introduce external search when you have clear, measured needs for distributed search features
- Always benchmark: data shape, query patterns, and operational constraints drive the decision
Conclusion
Postgres full-text search is mature, fast, and integrated. For most applications — product catalogs, documentation sites, blogs, help centers, and many internal tools — tsvector + tsquery + GIN + ts_rank provides everything you need without the complexity and cost of an extra search cluster. Implement persistent vectors with triggers, use weighted expression indexes for relevance, add websearch_to_tsquery for user-friendly parsing, and fall back to pg_trgm for fuzziness. Reserve Elasticsearch for workloads that truly require distributed indexing, extreme scale, or specialized analysis pipelines. Start with Postgres, measure, and only introduce external search when the data proves you need it.
Action Checklist
- Add a search_vector column to a representative table and populate it with to_tsvector; create a GIN index and test common queries with plainto_tsquery and websearch_to_tsquery.
- Implement a tsvector_update_trigger or an application update path for maintaining vectors and benchmark read latency and index update costs.
- Tune relevance: build weighted vectors with setweight, experiment with ts_rank_cd, and validate ordering against labeled examples.
- Add pg_trgm indexes for typo-tolerance as a fallback for queries that fail to match using FTS alone.
- Monitor index bloat and plan maintenance (REINDEX CONCURRENTLY) and only evaluate Elasticsearch after you have measured unmet performance or feature gaps.