Software update: Xapian / Omega 1.0.3

Xapian is an open source information retrieval library written in C++ that can serve as the engine behind a search engine. It comprises its own database format, APIs for modifying and searching databases, tools for checking databases, and bindings for other languages such as Java, Ruby, PHP and Python. Omega is an application built on top of Xapian: a search engine for searching Xapian databases. Omega also ships with several tools for populating databases with data. Because Omega's development is closely tied to that of Xapian itself, the developers release new versions of both programs simultaneously, under the same version number.

The development team of The Xapian Project has released version 1.0.3 of Xapian and Omega. The lists of changes for the various components are as follows:

Xapian-core 1.0.3:

  • Add support for user specified metadata (bug#143). Currently supported by the flint and inmemory backends.
  • Deprecate Enquire::register_match_decider() which has always been a no-op.
  • Improve the lower bound on the number of matching documents for an AND query - if the sum of the lower bounds for the two sides is greater than the number of documents in the database, then some of them must have both terms.
  • Spelling correction: Fix off-by-one error in loop bounds when initialising (bug#194).
  • If the check_at_least parameter to Enquire::get_mset() is used, but there aren't that many results, then MSet::get_matches_lower_bound() and MSet::get_matches_upper_bound() weren't always reported as equal - this bug is now fixed.
  • When sorting by value, and using the check_at_least parameter to Enquire::get_mset(), some potential matches weren't being counted.
  • Failing to create a flint or quartz database because we couldn't create the directory for it now throws DatabaseCreateError not DatabaseOpeningError.
  • Fix display of valgrind output when a test fails because valgrind detected a problem.
  • Add another version of valgrind suppression for the zlib end condition check as this gives a different backtrace for zlib in Ubuntu gutsy.
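
The improved lower bound for AND queries follows from a counting argument: if the two sides match at least a and b documents out of N in the database, then at least a + b - N documents must match both. A minimal Python sketch of that arithmetic follows; the function and parameter names are hypothetical and are not part of the Xapian API.

```python
def and_lower_bound(lower_a: int, lower_b: int, doc_count: int) -> int:
    """Lower bound on the number of documents matching both sides of an AND.

    lower_a and lower_b are lower bounds for each side of the query and
    doc_count is the number of documents in the database. All names here
    are illustrative assumptions, not Xapian identifiers.
    """
    # If the two lower bounds together exceed the database size, the
    # excess must be documents that contain both terms.
    return max(0, lower_a + lower_b - doc_count)
```

With 100 documents and lower bounds of 70 and 50, at least 20 documents must match both sides; with bounds of 30 and 40 the sum does not exceed 100, so the bound stays at 0.
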
flint backend:
  • The Flint database format has been extended to support user metadata, and each termlist entry is now a byte shorter (before compression). As a result, Xapian 1.0.2 and earlier won't be able to read Xapian 1.0.3 databases. However, Xapian 1.0.3 can read older databases. If you open an older flint database for writing with Xapian 1.0.3, it will be upgraded such that it cannot then be read by Xapian 1.0.2 and earlier.
  • Zlib compression wasn't being used for the spelling or synonym tables (due to a typo - Z_DEFAULT_COMPRESSION where it should be Z_DEFAULT_STRATEGY).
  • xapian-check: Allow "db/record." and "db/record.DB" as arguments.
  • Fix "key too long" exception message by substituting FLINT_BTREE_MAX_KEY_LEN with its numeric value.
  • Assorted minor efficiency improvements.
  • If we reach the flush threshold during a transaction, we now write out the postlist changes, but don't actually commit them.
  • Check length of new terms is at most 245 bytes for flint in add_document() and replace_document() so that the API user gets an error there rather than when flush() is called (explicitly or implicitly). Fixes bug#44.
  • Flint used to read the value of the environmental variable XAPIAN_FLUSH_THRESHOLD when the first WritableDatabase was opened and would then cache this value. However the program using Xapian may have changed it, so we now reread it each time a WritableDatabase is opened.
  • Implement TermIterator::positionlist_count() for the flint backend.
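
The term length check described in the flint notes can be pictured as a validation step that runs when a document is added rather than when changes are flushed. The sketch below is illustrative only: `check_terms`, `TermTooLongError` and `FLINT_MAX_TERM_BYTES` are hypothetical names, it assumes term length is measured in UTF-8 bytes, and only the 245-byte limit is taken from the changelog.

```python
FLINT_MAX_TERM_BYTES = 245  # limit quoted in the changelog entry


class TermTooLongError(ValueError):
    """Hypothetical error type for terms exceeding the byte limit."""


def check_terms(terms):
    """Reject over-long terms up front, before anything is written.

    Raises TermTooLongError for the first term whose UTF-8 encoding is
    longer than FLINT_MAX_TERM_BYTES; returns None otherwise.
    """
    for term in terms:
        size = len(term.encode("utf-8"))
        if size > FLINT_MAX_TERM_BYTES:
            raise TermTooLongError(
                f"term is {size} bytes, limit is {FLINT_MAX_TERM_BYTES}"
            )
```

Checking at this point means the caller sees the error next to the call that supplied the bad term, instead of at a later explicit or implicit flush.
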
remote backend:
  • Fix the result of MSet::get_matches_lower_bound() when using the check_at_least parameter to get_mset().
inmemory backend:
  • Implement TermIterator::positionlist_count() for the inmemory backend.
build system:
  • xapian-config: We always need to include dependency_libs in the output of `xapian-config --libs` if shared libraries are disabled.
  • Distribution tarballs are now in the POSIX "ustar" format. This supports pathnames longer than 99 characters (which we now have a few instances of in the doxygen generated documentation) and also results in a distribution tarball that is about half the size! This format should be readable by any tar program in current use - if your tar program doesn't support it, we'd like to know (but note that the GNU tar tarball is smaller than the size reduction in the xapian-core tarball...)
  • configure no longer generates msvc/version.h - this is now entirely handled by the MSVC-specific makefiles.
  • Add a glossary.
  • docs/stemming.html: Reorder the initial paragraphs so we actually answer the question "What is a stemming algorithm?" up front.
  • When running rst2html, use "--exit-status=warning" rather than "--strict". The former actually gives a non-zero exit status for a warning or worse, while the latter doesn't, but does include any "info" messages in the output HTML.
  • docs/deprecation.rst: Add "Database::positionlist_begin() throwing RangeError and DocNotFoundError".
  • valueranges.rst: Correct out-of-date reference to float_to_string.
  • HACKING: Document a few more "coding standards".
  • PLATFORMS: Updated.
  • docs/overview.html: Restore HTML header accidentally deleted in November 2006.
  • Fix several typos.
  • Add missing instances of "#include <string.h>" to fix compilation with recent GCC 4.3 snapshots.
  • Fix some warnings for various compilers and platforms.
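
The build system note about pathnames longer than 99 characters can be tried out with Python's standard `tarfile` module, which can write the POSIX ustar format. This is a sketch for experimentation, not a description of how the Xapian tarballs are built; the helper names and the example path are made up for illustration.

```python
import io
import tarfile


def make_ustar(path: str, payload: bytes) -> bytes:
    """Build an in-memory ustar archive containing a single file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name=path)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_names(archive: bytes) -> list:
    """Return the member names stored in an in-memory tar archive."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        return tar.getnames()


# A made-up documentation path that is longer than 99 characters.
long_path = "docs/apidoc/html/" + "x" * 80 + ".html"
names = list_names(make_ustar(long_path, b"<html></html>"))
```

The ustar header stores a long pathname as a directory prefix plus a file name, so the path has to contain a `/` at a suitable position; `tarfile` raises `ValueError` if it cannot split the name that way.
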

Omega 1.0.3:

  • Distribution tarballs are now in the POSIX "ustar" format since it saves a few KB and we need to use it for xapian-core anyway.
  • Expand the output of 'mbox2omega --help' and refer the reader to it from docs/scriptindex.txt.
  • omindex:
    • Add support for indexing AbiWord documents and TeX DVI files.
    • Impose a 5 minute CPU time limit on filter programs to prevent problems if a filter program goes into an infinite loop on a malformed input. Partly addresses bug#111.
  • scriptindex:
    • Fix line number tracking in dump files.
  • Add $muldiv{A,B,C} which calculates int(A*B/C).
  • Fix bug in decimal fraction in $size for files >= 1M in size.
  • query:
    • Set HTML charset to utf-8 since that's what databases now are by default.
    • Restyle to use CSS to draw a "score bar" instead of using images.
    • Rework the layout of each hit.
    • Add popup hints on mouse-over for various items.
    • Tidy up some HTML gremlins.
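
As an illustration of what $muldiv{A,B,C} computes, here is a small Python sketch of int(A*B/C). It assumes integer arguments and truncation toward zero, neither of which the changelog spells out, and `muldiv` is a hypothetical helper rather than Omega's implementation.

```python
def muldiv(a: int, b: int, c: int) -> int:
    """Compute int(a*b/c) with integer arithmetic.

    Assumption: the result is truncated toward zero. Python's //
    operator floors instead, so the sign is handled separately.
    """
    if c == 0:
        raise ZeroDivisionError("muldiv: C must not be zero")
    product = a * b
    quotient = abs(product) // abs(c)
    # The result is negative only when product and divisor differ in sign.
    return quotient if (product >= 0) == (c > 0) else -quotient
```

For example, `muldiv(7, 100, 8)` gives 87, the whole-number part of 87.5, which is the kind of calculation needed to scale a count into a percentage.
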

Xapian-bindings 1.0.3:

  • Wrap new methods Database::get_metadata() and WritableDatabase::set_metadata().
  • "make uninstall" now removes the loadable module we install for each of the bindings.
  • "make distcheck" now works.
  • Distribution tarballs are now in the POSIX "ustar" format since it saves about 40KB and we need to use it for xapian-core anyway.
  • RPMs: Package xapian.php.
  • Remove wrapper for ValueRangeProcessor::operator(), since it can't be usefully used currently.
  • Remove wrappers for the Muscat36 backend, which has now been dropped from the C++ library.
  • "make clean" now removes the class files generated for inner classes.
  • Add feature test for DateValueRangeProcessor when used with QueryParser.
  • ValueRangeProcessor::apply() can now be called from PHP (bug#193). This isn't actually very useful, since you can't subclass it in PHP currently.
  • Fixed wrapping of Enquire::set_cutoff() - previously this would only work if the third parameter was specified and a floating point number (e.g. 0.0).
  • php/docs/bindings.html: Fix errors in example code.
  • ValueRangeProcessor::operator() is now wrapped as a __call__ method in Python which takes two strings and returns a 3-tuple (value_number, modified_begin, modified_end). Previously this always failed with a type error, so this doesn't break existing code.
  • python/ Interpret any commandline arguments as a list of tests to be run (the default is to run all tests).
  • README,python/docs/bindings.html: Add a note about the problems with mod-python (as described in bug #185).
  • python/ Delete the database handles before deleting a database to fix problems running the Python tests on MS Windows (bug#179).
  • "make clean" now removes testsuite.pyc.
  • Check for RUBY_INC, RUBY_LIB, and RUBY_LIB_ARCH in the environment or on the configure command-line. The defaults for RUBY_LIB and RUBY_LIB_ARCH are now the site-specific directories, which is more correct when building from source. Debian packages, etc can override this by setting RUBY_LIB and RUBY_LIB_ARCH.
  • Check for TCL_LIB in the environment or on the configure command-line to allow installing without root access more cleanly.
The following files are available for download:
* Xapian 1.0.3
* Omega 1.0.3
* Xapian bindings 1.0.3
Version number: 1.0.3
Release status: Final
Operating systems: Windows 9x, Windows NT, Windows 2000, Linux, BSD, Windows XP, macOS, Solaris, UNIX, Windows Server 2003, Windows Vista
Website: The Xapian Project
License type: GPL

By Japke Rosink


02-10-2007 • 10:53


Source: The Xapian Project

