Showing 27 open source projects for ".pdf"

  • 1
    Easyspider - Distributed Web Crawler

    Easy Spider is a distributed Perl Web Crawler Project from 2006

    It features code for crawling webpages, distributing the results to a server, and generating XML files from them. The client side can be any computer (Windows or Linux), and the server stores all data. Websites that use EasySpider Crawling for Article Writing...
    Downloads: 0 This Week
  • 2
    OpenSearchServer Search Engine

    An open source search engine with a RESTful API and crawlers

    ...Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows and Linux/Unix/BSD.
    Downloads: 0 This Week
  • 3

    eXtensible Text Framework (XTF)

    Framework for search and display of heterogeneous document collections.

    ...Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
    Downloads: 5 This Week
  • 4
    Regain is a Java search engine based on Jakarta Lucene. It indexes and searches files in many formats (HTML, XML, doc(x), xls(x), ppt(x), oo, PDF, RTF, mp3, mp4, Java). A TagLibrary eases integrating search results into your JSP-based web page.
    Downloads: 21 This Week
  • 5
    IDRA (InDexing and Retrieving Automatically) is a tool for indexing a wide range of text files (TXT, DOC, PDF) and image annotation files (XML), running query-based searches, visualizing an index, saving it for reuse, evaluation, etc.
    Downloads: 1 This Week
  • 6
    phpShare&Search

    Group file share with advanced text parsing capability for easy search

    Originally created as a church resource sharing system, phpShare&Search allows users to create accounts, share documents, search documents, and like or report documents. phpShare&Search's power comes from its advanced document parser, which extracts text from .PDF, .TXT, .DOC, and .DOCX files, and from its community features of liking resources and reporting them as inappropriate or spam. Users can also subscribe to weekly updates of new content. Users may choose to download and host/install/configure/modify/manage this code themselves, or contract the code writer to do these things for them. ...
    Downloads: 0 This Week
  • 7
    JavaScript SQL (JSSQL)

    A database engine entirely in JavaScript (AJAX)

    JSSQL is a database engine developed entirely in JavaScript. It is a tool for developers to interpret and execute SQL statements on an offline database. It has a conversion class developed in PHP for use with relational databases (e.g. PostgreSQL, MySQL, etc.) that generates a series of data files; the JavaScript database engine (JSSQL) then accesses the data through SQL queries and returns a set of records, similar to any database engine. This is useful for querying offline databases...
    Downloads: 1 This Week
  • 8
    DocInfoRetriever is a web-based document full-text search engine based on Lucene. It allows you to search the contents and metadata of documents. Supported document formats include doc, xls, pdf, odt, jpg, etc., and torrent files.
    Downloads: 0 This Week
  • 9
    Google Documents Finder
    A search engine for doc, docx, ppt, pptx, and pdf files (open source). Created by: X-Cisadane (Dwi). Greetz to: XCode, Dunia Santai, Depok Cyber, Borneo Crew, Muslim Hackers, Hacker Cisadane, UG-HotZone 567.
    Downloads: 0 This Week
  • 10
    Booletin is a search engine for official gazettes (BOE, BOCM, etc.) that includes an email alert system. It uses Apache Lucene to index the PDF content of Spain's official gazettes.
    Downloads: 0 This Week
  • 11
    Script for automated downloading of PDFs from the Guardian's subscription service (guardian.newspaperdirect.com, formerly digital.guardian.co.uk).
    Downloads: 0 This Week
  • 12
    A utility to extract meta-information (properties/comments) out of various file-types; e.g. HTML, PDF, RTF & various Office documents; OGG/MP3 files and JPEG/PNG/GIF images, which can be presented in various output formats (HTML, XML, LaTeX & plain t
    Downloads: 0 This Week
  • 13
    PDFBox is a Java PDF Library. This project will allow access to all of the components in a PDF document. More PDF manipulation features will be added as the project matures. This ships with a utility to take a PDF document and output a text file.
    Downloads: 6 This Week
  • 14
    Web file manager, written in object-oriented PHP, with full-text retrieval capabilities (just for PDF files at the moment...). Interface similar to Explorer/Konqueror, with a tree structure on the left side. mod_mysql_auth integrated to grant user control, and OWASP ph
    Downloads: 0 This Week
  • 15
    YetAnotherMediaLibrary is a database application that allows users to index their personal documents and media, such as images, books, PDFs, CDs, DVDs, etc. The library is accessed through a themeable web interface. The database backend is MySQL.
    Downloads: 0 This Week
  • 16
    A web-based search interface tailored to the New Zealand Gazette PDF archive for the NZ library community. A generic Python-based Swish-e search interface.
    Downloads: 0 This Week
  • 17
    Spencer is a Java-based, web-hosted filesystem indexing application. It indexes files on network shares, reads inside MSOffice, Open/StarOffice, PDF and zip files and provides a web interface to the index with search functions to find the file you want.
    Downloads: 0 This Week
  • 18
    This project implements a complete enterprise solution based on Lucene. It's a smart engine implemented to index numerous file formats (pdf, ps, xls, doc, ppt). The engine can index file systems (filtering), databases, mailing folders, web sites and
    Downloads: 0 This Week
  • 19
    Satellite is a Perl website index/search package meant for indexing and searching medium-sized websites. Satellite currently supports text (.txt, .html, etc.) and pdf files. A demo is available at http://satellite2.sourceforge.net
    Downloads: 1 This Week
  • 20
    JSSindex (The JavaScript Search Engine) provides full-text search for collections of documents in HTML, PS, PDF, and DjVu. The index and query engine are entirely contained in JavaScript/HTML files. Therefore, searching merely requires a Web browser.
    Downloads: 0 This Week
  • 21
    E-Xoops Digger is an advanced search engine for Xoops and E-Xoops. Features include content indexing (e.g. pdf, doc, xls), results displayed by rank, limiting a search to a certain module, results with highlighted text snippets, fuzzy search, exact search...
    Downloads: 0 This Week
  • 22
    Kassandra is an SQL-based Latent Semantic Indexing and search engine written mostly in PHP. Supported formats will be at least HTML, PostScript and PDF.
    Downloads: 0 This Week
  • 23
    yaDMS, which stands for yet another Document Management System, is a PHP-based DMS with many features such as Clipboard, Mail2DMS, DMS2Mail, Zip&Download, Copy, Move, Multiuser, and Fulltextsearch (doc, pdf, rtf, txt, mp3; external tools needed).
    Downloads: 0 This Week
  • 24
    100% Java multithreaded search engine. Communication between the client and server takes place over TCP/IP. To index objects, it fetches the documents over HTTP and parses HTML, PDF, XML, and plain-text files. Artlight use
    Downloads: 0 This Week
  • 25
    A SOAP-based document/file-sharing solution written in Java. It includes a basic web interface, but other clients are possible. You can share and download all common office document formats such as MS Word, Excel, OpenOffice and PDF.
    Downloads: 0 This Week