Software update: Xapian 0.8.1

The development team of The Xapian Project has released version 0.8.1 of Xapian. Xapian is an open source information retrieval library written in C++ that serves as the engine behind a search engine. It comprises its own database format, APIs for modifying and searching databases, tools for checking databases, and bindings for other languages such as Java, PHP and Python. The changes in Xapian 0.8.1 are listed below and in the changelog:

  • New method Xapian::Database::get_lastdocid which returns the highest used document id for a database (useful for re-synchronizing an indexer which was interrupted). Implemented for quartz and inmemory.
  • Xapian::MSet::get_matches_*() methods now take collapsing into account, and the documentation has been clarified to state explicitly that collapsing and cutoffs are taken into account (bug#31).
  • Xapian::MSet: Need to adjust index by firstitem when indexing into items (bug#28).
  • MSetIterator and ESetIterator are now bidirectional iterators (rather than just input iterators)
  • Fixed post-increment forms of PostingIterator, TermIterator, PositionIterator, and ValueIterator so that *i++ works (as it must for them to be true input iterators).
  • Xapian::QueryParser: If we fail to parse a query, try stripping out non-alphanumerics (except '.') and reparsing.
  • Fixed memory leaked upon Xapian::QueryParser destruction.
  • Removed several unused Xapian::Error subclasses (these were used by the indexer framework which we decided was a failed experiment).
  • queryparsertest: Pruned near-duplicate queryparsertest testcases.
  • queryparsertest: Added test case for `term NOT "a phrase"'.
  • remotetest: Use instead of localhost so that tcpmatch1 doesn't fail just because the network setup is broken.
  • apitest: Make emptyquery1 check that Query("") causes an InvalidArgumentError exception.
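
The QueryParser fallback described in the list above (retry with non-alphanumerics other than '.' stripped out) can be sketched in a few lines of standard C++. This is a minimal illustration, not Xapian's implementation: the names `strip_non_alnum`, `Parser` and `parse_with_fallback` are hypothetical, the parser is modelled as a callback that throws `std::runtime_error` on a syntax error, and replacing stripped characters with spaces (rather than deleting them) is an assumption made so that adjacent words stay separate.

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <stdexcept>
#include <string>

// Replace every character that is neither alphanumeric nor '.' with a space.
// Assumption: a space (rather than deletion) keeps neighbouring words apart.
std::string strip_non_alnum(const std::string& query) {
    std::string out;
    out.reserve(query.size());
    for (unsigned char ch : query) {
        out += (std::isalnum(ch) || ch == '.') ? static_cast<char>(ch) : ' ';
    }
    return out;
}

// Hypothetical parser signature: returns a parsed form, throws on bad syntax.
using Parser = std::function<std::string(const std::string&)>;

// Try the query as typed; if parsing fails, retry once with the stripped form.
// A failure of the second attempt propagates to the caller.
std::string parse_with_fallback(const Parser& parse, const std::string& query) {
    assert(parse != nullptr);  // an empty callback is a programming error
    try {
        return parse(query);
    } catch (const std::runtime_error&) {
        return parse(strip_non_alnum(query));
    }
}
```

With a parser that rejects an unbalanced double quote, a query such as `term NOT "a phrase` fails on the first attempt and is then retried with the quote replaced by a space.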
quartz backend:
  • Fixed bug which meant we sometimes failed to remove a posting when deleting or replacing a document.
  • Fixed PostlistChunkReader to take a copy of the postlist data being read to avoid problems with reading data from a string that's been deleted.
  • Fixed bug in postlist merging which could occasionally extend a postlist chunk to overlap the docid range of the next chunk.
  • Eliminated the split cursor in each Btree object - we only actually need a single block buffer to handle splitting blocks. This reduces the memory overhead of each Bcursor (and hence each QuartzPostList).
  • Changed 2 calls to abort() to throw Xapian::DatabaseCorruptError instead.
  • If Btree is writable, throw DatabaseCorruptError if we detect overwritten.
  • Check the return value of fdatasync()/fsync()/_commit() and raise an error. If they fail, we really want to know as it could cause data corruption.
  • Assorted clean ups, improved comments, debug tracing, assertions.
  • When merging in postlist changes, removed an unneeded call to QuartzBufferedTable::get_or_make_tag() in a case when we're using a cursor which has already fetched the tag.
  • Added SON_OF_QUARTZ define to disable incompatible changes to database formats by default, and use it to control the docid encoding for keys such that we're always inserting at the end of the table when adding new documents.
  • Reopening the readonly version of a writable Btree is now more efficient (we used to close and reopen all the files and destroy and recreate a lot of objects and buffers).
  • Share file descriptors between the read and write Btree objects so that a quartz WritableDatabase now uses 5 fds rather than 10.
  • Added configure test for glibc, because otherwise we need to include a header before we can check for glibc in order to define something we should be defining before we include any headers! Defining _XOPEN_SOURCE on OpenBSD seems to do the opposite to Linux and *disable* pread and pwrite!
  • Stripped out the session machinery - all that is actually required is to ensure that any unflushed changes are flushed when the destructor runs.
  • A few other backend interface cleanups.
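
The quartz change that checks the result of fdatasync()/fsync()/_commit() follows a simple pattern, sketched here for fsync() on a POSIX system. The function name `sync_or_throw` is hypothetical, and `std::runtime_error` merely stands in for whichever error class the backend actually raises; the changelog does not say which one.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>
#include <unistd.h>

// Flush fd to stable storage and throw if the operating system reports failure.
// Assumption: POSIX fsync(); a real backend would pick fdatasync()/_commit()
// where available and raise its own error type instead of std::runtime_error.
void sync_or_throw(int fd) {
    assert(fd >= 0);  // negative descriptors are a caller bug, not an I/O error
    if (fsync(fd) != 0) {
        const int err = errno;  // save errno before any other call can change it
        throw std::runtime_error(std::string("fsync failed: ") + std::strerror(err));
    }
}
```

Ignoring the return value would let a failed flush pass silently, which is exactly the data-corruption risk the changelog entry mentions.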
build system:
  • Unified the shlib version numbers (the small benefit of tracking them individually makes it hard to justify the extra work required, and having one version simplifies debian packaging too).
  • Removed trivial m4/ and autoconf/ and do the work from the top level instead. It's easier to see the structure this way, and it also removes a couple of recursive make invocations which will speed up builds a little.
  • HACKING: Added a list of subtasks when doing a release. Currently it's always me that does this, but it may not always be and anyhow it'll help me to have a list to run through.
  • include/xapian/database.h: Remove references to sessions in doxygen comments.
  • docs/quickstart.html: Corrected lingering reference to "om.h" and note that we need .
  • docs/,docs/, docs/ Add .
  • docs/quartzdesign.html: Corrected various pieces of out of date information, and improved wording in a couple of places.
  • docs/scalability.html: Removed the reference to the Quartz update bottleneck "currently being addressed for Xapian 0.8" as it's now been addressed! Also reworded to remove use of first person (it was originally a message sent to the mailing list).
Alongside Xapian there is the Omega search engine, an application built on top of Xapian that can be used (including via CGI) to search Xapian databases. Omega ships with a few tools for populating databases with data. Because Omega's development is closely tied to that of Xapian itself, both are released at the same time under the same version number, so Omega is now at version 0.8.1 as well. The release notes list the following three changes:
  • omindex: Renamed hash() to hash_string() to avoid colliding with something on IRIX.
  • omega: Changed MORELIKE to pick up to 40 terms, rather than up to 6 (feedback on the mailing list suggests this gives much better results).
  • scriptindex: Added explicit catch for std::bad_alloc.

Gathering of Tweakers has been using Omega for more than a year. Server administrator Arjen van der Meijden has written an article explaining how the search engine (and, indirectly, Omega) works.
    Version: 0.8.1
    Operating systems: Linux, BSD, macOS, Solaris
    Website: The Xapian Project
    File size: 1.96 MB
    License type: GPL

    By Robin Vreuls


    01-07-2004 • 12:07


    Source: The Xapian Project

    Comments (1)

    I do find the documentation a bit thin, though...
    (or am I looking in the wrong place?)
    I know the whole API is documented, but a few simple examples, for instance of how to load data from a db, would have been appreciated :)
