
Programming language: JavaScript
License: MIT License
Tags: Search Engines    
Latest version: v2.1.19

Ambar alternatives and similar software solutions

Based on the "Search Engines" category.

  • MeiliSearch

    A lightning-fast search API that fits effortlessly into your apps, websites, and workflow
  • Searx

    DISCONTINUED. Privacy-respecting metasearch engine
  • Typesense

    Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences
  • Yacy

    Distributed Peer-to-Peer Web Search Engine and Intranet Search Appliance
  • Gigablast

    Nov 20 2017 -- A distributed open source search engine and spider/crawler written in C/C++ for Linux on Intel/AMD. From gigablast dot com, which has binaries for download. See the README.md file at the very bottom of this page for instructions.
  • sist2

    Lightning-fast file system indexer and search tool
  • Seeks

    Seeks is a decentralized p2p websearch and collaborative tool.
  • multiSearchHome

    Local standalone HTML homepage for searching across 175 search engines (DuckDuckGo, YouTube, Twitter, Wikipedia, etc.)


README


Ambar: Document Search Engine


⚠️ PROJECT ARCHIVED ⚠️

Ambar is an open-source document search engine with automated crawling, OCR, tagging and instant full-text search.

Ambar defines a new way to implement full-text document search into your workflow.

  • Easily deploy Ambar with a single docker-compose file
  • Perform Google-like searches through your documents and the contents of your images
  • Tag your documents
  • Use a simple REST API to integrate Ambar into your workflow

Features

Search

Tutorial: Mastering Ambar Search Queries

  • Fuzzy Search (John~3)
  • Phrase Search ("John Smith")
  • Search By Author (author:John)
  • Search By File Path (filename:*.txt)
  • Search By Date (when: yesterday, today, lastweek, etc)
  • Search By Size (size>1M)
  • Search By Tags (tags:ocr)
  • Search As You Type
  • Supported language analyzers: English ambar_en, Russian ambar_ru, German ambar_de, Italian ambar_it, Polish ambar_pl, Chinese ambar_cn, CJK ambar_cjk
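
The query forms listed above can be composed programmatically. The following is a minimal Python sketch, not part of Ambar itself: the helper names (`fuzzy`, `phrase`, `field`, `build_query`) are hypothetical, and joining several terms with spaces is an assumption about how Ambar combines them. Only the individual term syntax comes from the list above.

```python
# Hypothetical helpers for composing Ambar-style search strings.
# Only the term syntax (John~3, "John Smith", author:John, ...) is taken
# from the feature list; combining terms with spaces is an assumption.

def fuzzy(term: str, distance: int) -> str:
    """Fuzzy term, e.g. John~3."""
    return f"{term}~{distance}"


def phrase(text: str) -> str:
    """Exact phrase, e.g. "John Smith"."""
    return f'"{text}"'


def field(name: str, value: str) -> str:
    """Field filter, e.g. author:John, filename:*.txt, tags:ocr."""
    return f"{name}:{value}"


def build_query(*parts: str) -> str:
    """Join the non-empty parts with single spaces."""
    return " ".join(p for p in parts if p)


query = build_query(
    phrase("John Smith"),
    field("author", "John"),
    field("filename", "*.txt"),
    "size>1M",
    field("tags", "ocr"),
)
print(query)
```

The resulting string can be pasted into the search box as-is. Whether every combination of terms is accepted depends on the Ambar version in use.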

Crawling

Ambar 2.0 only supports local file system crawling. If you need to crawl an SMB share or an FTP location, mount it using standard Linux tools. Crawling is automatic and needs no schedule, because the crawlers monitor file system events and automatically process new, changed, and removed files.
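
To illustrate the three cases a crawler has to distinguish (new, changed, and removed files), here is a small Python sketch. It is not Ambar's implementation: Ambar reacts to file system events, whereas this example swaps in a simpler technique, comparing two directory snapshots. The function names and the choice of modification time plus size as the change signal are assumptions made for illustration.

```python
import os
from typing import Dict, Set, Tuple

# A snapshot maps each file path to (modification time, size in bytes).
Snapshot = Dict[str, Tuple[float, int]]


def take_snapshot(root: str) -> Snapshot:
    """Walk `root` and record mtime and size for every readable file."""
    snapshot: Snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file disappeared or is unreadable; skip it
            snapshot[path] = (st.st_mtime, st.st_size)
    return snapshot


def diff_snapshots(
    old: Snapshot, new: Snapshot
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (added, changed, removed) paths between two snapshots."""
    old_paths, new_paths = set(old), set(new)
    added = new_paths - old_paths
    removed = old_paths - new_paths
    changed = {p for p in old_paths & new_paths if old[p] != new[p]}
    return added, changed, removed
```

Calling `take_snapshot` twice on a mounted share and passing both results to `diff_snapshots` yields the files that would need indexing, re-indexing, or removal from the index. An event-based watcher avoids the repeated full directory walk that this approach requires.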

Content Extraction

Ambar supports large files (>30MB)

Supported file types:

  • ZIP archives
  • Mail archives (PST)
  • MS Office documents (Word, Excel, Powerpoint, Visio, Publisher)
  • OCR over images
  • Email messages with attachments
  • Adobe PDF (with OCR)
  • OCR languages: Eng, Rus, Ita, Deu, Fra, Spa, Pl, Nld
  • OpenOffice documents
  • RTF, Plaintext
  • HTML / XHTML
  • Multithread processing

Installation

Notice: Ambar requires Docker to run.

You can build the Docker images yourself. A tutorial on building the images from scratch follows below.

Building the images yourself

All the images required to run Ambar can be built locally. In general, each image can be built by navigating into the directory of the component in question, performing the required compilation steps, and building the image like this:

# From project root
$ cd FrontEnd
$ docker build . -t <image_name>

The resulting image can be referred to by the name specified, and run by the containerization tooling of your choice.

In order to use a local Dockerfile with docker-compose, simply change the image option to build, setting the value to the relative path of the directory containing the Dockerfile. Then run docker-compose build to build the relevant images. For example:

# docker-compose.yml from project root, referencing local Dockerfiles
pipeline0:
  build: ./Pipeline/
  image: chazu/ambar-pipeline  # optional: name given to the locally built image
localcrawler:
  build: ./LocalCrawler/
Note that some of the components require compilation or other build steps to be performed on the host before the Docker images can be built. For example, for FrontEnd:

# Assuming a suitable version of node.js is installed (docker uses 8.10)
$ npm install
$ npm run compile

Then follow these instructions: https://ambar.cloud/docs/installation

FAQ

Is it open-source?

Yes, it's fully open-source.

Is it free?

Yes, it is forever free and open-source.

Does it perform OCR?

Yes, it performs OCR on images (jpg, tiff, bmp, etc.) and PDFs. OCR is performed by the well-known open-source library Tesseract. We tuned it to achieve the best performance and quality on scanned documents. You can easily find all files on which OCR was performed with the tags:ocr query.

Which languages are supported for OCR?

Supported languages: Eng, Rus, Ita, Deu, Fra, Spa, Pl, Nld.

Does it support tagging?

Yes!

What about searching in PDF?

Yes, it can search through any PDF, even one that is badly encoded or contains scans. We did our best to make search over any kind of PDF document smooth.

What is the maximum file size it can handle?

It's limited by the amount of RAM on your machine; typically the limit is 500MB. That is a strong result, as typical document management systems process files of at most 30MB.


Change Log


Privacy Policy


License

MIT License


*Note that all licence references and agreements mentioned in the Ambar README section above are relevant to that project's source code only.