Lucene query cheatsheet
Basic Search
- Single Term:
term
- Finds documents containing term.
- Phrase Search:
"exact phrase"
- Finds documents containing the exact phrase.
Boolean Operators
- AND:
term1 AND term2
- Both terms must be present.
- OR:
term1 OR term2
- At least one of the terms must be present.
- NOT:
NOT term
- Documents must not contain term.
- Combination:
(term1 AND term2) OR term3
- Complex boolean logic can be applied by combining operators.
Wildcard Searches
- Single Character Wildcard:
te?t
- The ? stands for exactly one character, so this matches terms such as test and text.
- Multiple Character Wildcard:
test*
- The * stands for zero or more characters, so this matches test, tests, tester, and so on.
- Wildcard at Start:
*test
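- Leading wildcards are disabled by default in Lucene's classic query parser (they can be enabled with setAllowLeadingWildcard(true)). They tend to be slow because every term in the index has to be checked.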
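To make the wildcard semantics concrete, here is a minimal Python sketch that translates a wildcard pattern into a regular expression and tests it against a small list of terms. The helper names (wildcard_to_regex, wildcard_match) and the term list are made up for illustration; this is only a model of the matching rule, not how Lucene implements wildcard queries internally.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a Lucene-style wildcard pattern into a regex.

    ? matches exactly one character, * matches zero or more characters.
    Every other character is taken literally.
    """
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def wildcard_match(pattern: str, term: str) -> bool:
    # fullmatch: the pattern has to cover the whole term, not just a part of it.
    return wildcard_to_regex(pattern).fullmatch(term) is not None


# Hypothetical indexed terms, for illustration only.
terms = ["test", "text", "tests", "tester", "contest", "tet"]

print([t for t in terms if wildcard_match("te?t", t)])   # ['test', 'text']
print([t for t in terms if wildcard_match("test*", t)])  # ['test', 'tests', 'tester']
print([t for t in terms if wildcard_match("*test", t)])  # ['test', 'contest']
```

Note that the pattern is compared against individual indexed terms, so the result depends on how the field was analyzed at index time.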
Fuzzy Searches
- Fuzzy:
term~
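- Matches terms within a small edit distance of the specified term (up to 2 edits by default). An explicit limit can be given, e.g. term~1 allows only one edit.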
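The idea of "similar terms" can be illustrated with an edit-distance function. The Python sketch below computes the optimal string alignment distance (insertions, deletions, substitutions, and swaps of adjacent characters), which is the kind of measure Lucene's fuzzy matching is based on. The function names and the vocabulary are assumptions for illustration; Lucene itself uses automata rather than comparing terms one by one.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Counts insertions, deletions, substitutions, and transpositions
    of two adjacent characters, each as one edit.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def fuzzy_candidates(term, vocabulary, max_edits=2):
    """Return the vocabulary terms within max_edits of term."""
    return [t for t in vocabulary if osa_distance(term, t) <= max_edits]


# Hypothetical indexed terms, for illustration only.
vocab = ["roam", "foam", "roams", "room", "raom", "rhombus"]
print(fuzzy_candidates("roam", vocab, max_edits=1))
# ['roam', 'foam', 'roams', 'room', 'raom']
```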
Proximity Searches
- Proximity:
"term1 term2"~N
- Matches documents in which term1 and term2 occur within N positions (words) of each other. N is called the slop.
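A simplified way to think about the slop value N is as the number of position moves needed to line the query terms up with the document. The Python sketch below implements that simplified model; min_slop and the sample sentence are hypothetical, and the real Lucene scorer handles more cases (repeated terms, multiple matches) than this does.

```python
from itertools import product


def min_slop(query_terms, doc_tokens):
    """Smallest slop under which the phrase would match (simplified model).

    Returns None if any query term is missing from the document.
    """
    positions = []
    for term in query_terms:
        hits = [i for i, tok in enumerate(doc_tokens) if tok == term]
        if not hits:
            return None
        positions.append(hits)

    best = None
    # Try every way of picking one document position per query term.
    for combo in product(*positions):
        if len(set(combo)) < len(combo):
            continue  # two query terms cannot share one document position
        # Shift each position back by the term's offset in the query;
        # an exact phrase match makes all shifted values equal.
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        needed = max(shifted) - min(shifted)
        if best is None or needed < best:
            best = needed
    return best


doc = "the quick brown fox".split()
print(min_slop(["quick", "brown"], doc))  # 0: exact phrase
print(min_slop(["quick", "fox"], doc))    # 1: one word in between
print(min_slop(["fox", "quick"], doc))    # 3: reversed order costs extra moves
```

Under this model, "quick fox"~1 would match the sample sentence, while the reversed "fox quick" would need ~3.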
Range Searches
- Range:
[start TO end]
- Finds documents with terms within the specified range, including the start and end values.
- Exclusive Range:
{start TO end}
- Excludes the exact start and end values.
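The difference between the two bracket styles, and a common pitfall with ranges on text fields, can be sketched in a few lines of Python. The in_range helper is hypothetical; it only mirrors the comparison rule. On text fields, terms are compared as strings (lexicographically), which is what the last two lines illustrate.

```python
def in_range(value, start, end, inclusive=True):
    """Mirror [start TO end] (inclusive) or {start TO end} (exclusive)."""
    if inclusive:
        return start <= value <= end
    return start < value < end


codes = [399, 400, 450, 499, 500]
print([c for c in codes if in_range(c, 400, 499)])                   # [400, 450, 499]
print([c for c in codes if in_range(c, 400, 499, inclusive=False)])  # [450]

# Numeric comparison versus string (lexicographic) comparison:
print(in_range(100, 10, 20))        # False: 100 is not between 10 and 20
print(in_range("100", "10", "20"))  # True: "100" sorts between "10" and "20"
```

This is why numeric and date values are best indexed in a numeric or date field type (or zero-padded) when range queries are needed.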
Regular Expressions
- Regex:
/regex/
- Matches terms by regular expression.
Boosting Terms
- Boost:
term^N
- Multiplies the term's contribution to the relevance score by N. The default boost is 1, and N is usually a positive number such as 2 or 0.5.
Field-Specific Searches
- Specific Field:
fieldname:term
- Searches for the term within a specific field.
Grouping
- Group Queries:
(query1) AND (query2)
- Groups parts of queries for complex searches.
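Because characters such as : ( ) * ? and / are part of the syntax above, user input that contains them needs to be escaped with a backslash before it is placed in a query. Lucene's Java API offers QueryParser.escape for this; the Python sketch below is a rough equivalent under the assumption that the same character set applies. The name escape_term is made up for this example.

```python
# Characters that have a special meaning in the Lucene query syntax.
# && and || are covered by escaping each & and | individually.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')


def escape_term(text: str) -> str:
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_term("/images/logo(1).png"))
print(escape_term("status:ok"))
print("request:" + escape_term("/index.html"))
```

Escaping makes the wildcard characters literal as well, so apply it only to the parts of a query that should be matched as-is.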
How to search Apache HTTPD logs using Lucene
These examples assume that the logs have been indexed in a Lucene-based system like Elasticsearch, and they demonstrate how to use various Lucene query features to filter and search log data effectively. Note that the specific fields used in these examples (ip, timestamp, response, request, etc.) should correspond to the fields defined in your Lucene schema for Apache HTTPD logs.
// 1. Find logs for a specific IP address
ip:"192.168.1.1"
// 2. Search logs within a specific date range
timestamp:[20230101 TO 20230131]
// 3. Identify logs with 4xx client error response codes
response:[400 TO 499]
// 4. Locate logs for requests to a specific URL
request:"GET /index.html HTTP/1.1"
// 5. Filter logs by a specific user-agent string
agent:"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
// 6. Search for logs with a specific referrer
referrer:"http://example.com/"
// 7. Find all logs of GET requests
request_method:GET
// 8. Filter logs resulting in 5xx server errors
response:[500 TO 599]
// 9. Identify requests to a specific directory (wildcards are not expanded inside quotes, and / must be escaped)
request:\/images\/*
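// 10. Locate requests taking longer than 2 seconds (assumes duration is stored in milliseconds; Elasticsearch query_string also accepts duration:>2000)
duration:{2000 TO *]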
// 11. Exclude logs from a specific IP address
-ip:"192.168.1.1"
// 12. Find requests for a specific file type (.jpg); this is a leading wildcard, which must be allowed by the parser and can be slow
request:*.jpg
// 13. Identify logs from a specific day
timestamp:20230115
// 14. Search logs with responses in a byte range
bytes:[1000 TO 5000]
// 15. Filter logs by HTTP method and response code
request_method:POST AND response:200
// 16. Search for failed login attempts (custom log message)
message:"Failed login attempt"
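// 17. Find logs from a range of IP addresses (needs an IP-aware field such as the Elasticsearch ip type; on a plain text field the comparison is lexicographic)
ip:[192.168.1.1 TO 192.168.1.100]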
// 18. Identify logs with a 200 OK response
response:200
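// 19. Search for logs with specific query parameters (no quotes so the * wildcards apply; the literal ? and = are escaped; assumes request is indexed as a single un-analyzed term)
request:*\?user\=john&*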
// 20. Locate logs with a 404 Not Found response
response:404
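When these queries are generated by a script rather than typed by hand, it helps to assemble them from small pieces. The following Python sketch builds a query string from optional filters, using the field names from the examples above (request_method, response, ip). The function names are hypothetical, and how the resulting string is sent to the search system is left out.

```python
def quote(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes inside it."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_log_query(method=None, status_from=None, status_to=None, exclude_ip=None):
    """Combine optional filters into one Lucene query string."""
    clauses = []
    if method:
        clauses.append(f"request_method:{method}")
    if status_from is not None and status_to is not None:
        # Square brackets: both ends of the range are included.
        clauses.append(f"response:[{status_from} TO {status_to}]")
    # A query made only of NOT clauses may match nothing in plain Lucene,
    # so start from "match all documents" when there is no positive clause.
    query = " AND ".join(clauses) if clauses else "*:*"
    if exclude_ip:
        query += f" AND NOT ip:{quote(exclude_ip)}"
    return query


print(build_log_query(method="POST", status_from=500, status_to=599,
                      exclude_ip="192.168.1.1"))
# request_method:POST AND response:[500 TO 599] AND NOT ip:"192.168.1.1"
print(build_log_query(exclude_ip="192.168.1.1"))
# *:* AND NOT ip:"192.168.1.1"
```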
I’m a DevOps/SRE/DevSecOps/Cloud Expert passionate about sharing knowledge and experiences. I am working at Cotocus. I blog tech insights at DevOps School, travel stories at Holiday Landmark, stock market tips at Stocks Mantra, health and fitness guidance at My Medic Plus, product reviews at I reviewed, and SEO strategies at Wizbrand.