NAME

KinoSearch::InvIndexer - Build inverted indexes.

SYNOPSIS

    use KinoSearch::InvIndexer;
    use MySchema;

    my $invindexer = KinoSearch::InvIndexer->new(
        invindex => MySchema->clobber('/path/to/invindex'),
    );

    while ( my ( $title, $content ) = each %source_docs ) {
        $invindexer->add_doc({
            title   => $title,
            content => $content,
        });
    }

    $invindexer->finish;

DESCRIPTION

The InvIndexer class is KinoSearch's primary tool for managing the content of inverted indexes, which may later be searched using KinoSearch::Searcher.

Only one InvIndexer may write to an invindex at a time. If a write lock cannot be secured, new() will throw an exception.

If an index is located on a shared volume, each writer application must identify itself by passing a LockFactory to InvIndexer's constructor, or index corruption will occur. See LockFactory's documentation for a detailed explanation.
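
Since new() throws an exception when the write lock cannot be secured, a caller may want to trap the failure and try again. What follows is only a sketch: try_new_invindexer() is a hypothetical helper, not part of KinoSearch, and it treats any exception thrown by the constructor as a lock failure, which is a simplification.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch).  $make_indexer is a coderef
# which calls KinoSearch::InvIndexer->new and returns the result.
sub try_new_invindexer {
    my ( $make_indexer, $attempts, $delay ) = @_;
    my $error = '';
    for my $attempt ( 1 .. $attempts ) {
        my $invindexer = eval { $make_indexer->() };
        return $invindexer if defined $invindexer;
        $error = $@;    # save the exception before it can be overwritten
        sleep $delay if $delay and $attempt < $attempts;
    }
    die "Couldn't create InvIndexer after $attempts attempts: $error";
}

# Usage (assumes $invindex is already defined):
#
#   my $invindexer = try_new_invindexer(
#       sub { KinoSearch::InvIndexer->new( invindex => $invindex ) },
#       3,    # attempts
#       5,    # seconds to wait between attempts
#   );
```

Passing the constructor call as a coderef keeps the retry logic independent of the parameters supplied to new().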

CONSTRUCTOR

new( [labeled params] )

    my $invindexer = KinoSearch::InvIndexer->new(
        invindex     => $invindex,  # required
        lock_factory => $factory,   # default: created internally
    );

METHODS

add_doc(doc)

    my $doc = KinoSearch::Doc->new( 
        fields => { field_name => $field_value },
        boost  => 2.5,
    );
    $invindexer->add_doc($doc);
    
    # or...
    $invindexer->add_doc( { field_name => $field_value } );

Add a document to the invindex. Accepts a single argument which may be either a KinoSearch::Doc object, or a hashref (which will be used to create a KinoSearch::Doc object internally).

add_invindex(invindex)

Absorb an existing invindex into this one. The two invindexes must have matching Schemas.

finish( [labeled params] )

Finish processing any changes made to the invindex and commit. Until the commit happens near the end of finish(), none of the changes made during an indexing session are permanent.

Calling finish() invalidates the InvIndexer, so if you want to make more changes you'll need a new one.

Takes one labeled parameter:

  • optimize - If optimize is set to 1, the invindex will be collapsed to its most compact form, a process which may take a while -- but which will yield the fastest queries at search time.
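
A minimal sketch of ending a session, with or without optimization. commit_session() is a hypothetical helper, not part of KinoSearch; it relies only on finish() and its optimize parameter as described above, and clears the caller's variable because the InvIndexer is no longer usable after finish().

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch).  Takes a reference to the
# variable holding the InvIndexer so that it can be cleared after the commit.
sub commit_session {
    my ( $invindexer_ref, %args ) = @_;
    my $invindexer = $$invindexer_ref;
    if ( $args{optimize} ) {
        $invindexer->finish( optimize => 1 );
    }
    else {
        $invindexer->finish;
    }

    # finish() invalidates the InvIndexer, so clear the caller's variable.
    undef $$invindexer_ref;
    return;
}

# Usage (assumes $invindexer holds a live InvIndexer):
#
#   commit_session( \$invindexer, optimize => 1 );
```

Since optimizing may take a while, an application might pass optimize only during off-peak runs and commit without it the rest of the time.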

delete_by_term( [labeled params] )

Mark documents which contain the supplied term as deleted, so that they will be excluded from search results. The change is not apparent to search apps until a new Searcher is opened after finish() completes.

delete_by_term() only affects documents already committed to the index; docs added during this session (via add_doc() or add_invindex()) will not be deleted.

  • field - The name of an indexed field. (If it is not spec'd as indexed, an error will occur.)
  • term - The term which identifies docs to be marked as deleted. If field is associated with an analyzer, term will be processed automatically (so don't pre-process it yourself).
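
Because delete_by_term() leaves docs added during the current session alone, one way to update a document is to delete the committed version by a unique identifier and add the new version in the same session. The sketch below assumes the doc is supplied as a hashref and that the identifying field is indexed and holds a value unique to each document; replace_doc() and the url field are hypothetical, not part of KinoSearch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch).  $doc is a hashref of
# field names and values; $id_field names an indexed field whose value is
# assumed to be unique to each document.
sub replace_doc {
    my ( $invindexer, $id_field, $doc ) = @_;
    my $id = $doc->{$id_field};
    die "Doc has no value for '$id_field'" unless defined $id;

    # Mark any previously committed version as deleted.  The raw value is
    # passed as-is, since analysis of the term is handled automatically.
    $invindexer->delete_by_term( field => $id_field, term => $id );

    # Add the new version.  It is not affected by the deletion above.
    $invindexer->add_doc($doc);
    return;
}

# Usage (assumes $invindexer holds a live InvIndexer):
#
#   replace_doc( $invindexer, 'url',
#       { url => $url, title => $title, content => $content } );
```

Neither change is visible to search apps until finish() completes and a new Searcher is opened.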

INHERITANCE

KinoSearch::InvIndexer isa KinoSearch::Obj.

COPYRIGHT

Copyright 2005-2008 Marvin Humphrey

LICENSE, DISCLAIMER, BUGS, etc.

See KinoSearch version 0.20.
