This is a lot simpler from a concurrency perspective. Training values
can get committed to the database immediately, rather than in
long-running flat file batches.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
If an HTML title was parsed with surrounding whitespace, that
whitespace was not stripped. This fixes that.
Also, there are some new debug log messages in linkbot. Hooray!
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
Previously, the environment variable would take priority over the
command line argument. This is now reversed.
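A minimal sketch of the new precedence, assuming argparse is used. The
`--config` flag, the `BOT_CONFIG` variable, and the `bot.yaml` default are
hypothetical names for illustration; the commit does not name the setting.

```python
import argparse
import os


def resolve_setting(argv, environ=None, default="bot.yaml"):
    """Return the setting, preferring the command line over the environment."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    # default=None lets us tell "flag not given" apart from an explicit value.
    parser.add_argument("--config", default=None)  # hypothetical flag name
    args = parser.parse_args(argv)
    if args.config is not None:
        return args.config  # command line wins
    return environ.get("BOT_CONFIG", default)  # hypothetical variable name
```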
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
Just like the actual chain data structure, this value is now loaded
lazily, since it's stored in the filesystem.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
This allows markov to save (hopefully) in parallel using a
ProcessPoolExecutor. Since objects are sent over-the-wire and copied,
pruning in parallel is not an issue.
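A rough sketch of the idea, not the actual markov code. `save_chain`,
`save_all`, the JSON format, and the `executor_cls` parameter are all
assumptions made for illustration. Each worker receives a pickled copy of the
chain data, so changes made in the parent do not affect what gets written.

```python
import json
import os
from concurrent.futures import ProcessPoolExecutor


def save_chain(directory, name, data):
    """Write one chain to disk; a worker process gets its own copy of data."""
    path = os.path.join(directory, name + ".json")
    with open(path, "w") as fp:
        json.dump(data, fp)
    return path


def save_all(directory, chains, executor_cls=ProcessPoolExecutor):
    """Save every chain through the executor and return the written paths."""
    with executor_cls() as pool:
        futures = [pool.submit(save_chain, directory, name, data)
                   for name, data in chains.items()]
        return [future.result() for future in futures]
```

On platforms that start workers with "spawn", the call to `save_all` needs to
sit under an `if __name__ == "__main__":` guard.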
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
This moves the self.__touch() call around in markov's Chain class such
that it will only access truly available data.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* Chain.__touch() is a new function that updates the last time a markov
chain was accessed
* Fix a bug that would not reliably update the last access time of the
chain during Chain.add()
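Roughly, the shape of this change. Only Chain, __touch() and add() come from
the commit; the `clock` parameter, the `last_access` attribute, and the
storage layout are assumptions for this sketch.

```python
import time


class Chain:
    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable clock (assumption, eases testing)
        self._data = {}
        self.last_access = clock()

    def __touch(self):
        """Record that the chain was just used."""
        self.last_access = self._clock()

    def add(self, word, next_word):
        # Touch on every add so the last access time is always updated.
        self.__touch()
        followers = self._data.setdefault(word, {})
        followers[next_word] = followers.get(next_word, 0) + 1
```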
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* Log levels can now be set via the command line and the configuration
file.
* ServerConfig.load() function takes a file-like object now, rather than
a string
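A sketch of what a file-like load() can look like. The JSON format, the
`host` and `port` fields, and the classmethod form are assumptions; the real
ServerConfig may differ on all three.

```python
import io
import json


class ServerConfig:
    def __init__(self, host, port):
        self.host = host
        self.port = port

    @classmethod
    def load(cls, fp):
        """Build a config from any object with a read() method."""
        raw = json.load(fp)
        return cls(raw["host"], int(raw.get("port", 6667)))


# An open file works, and so does an in-memory buffer, which helps in tests.
config = ServerConfig.load(io.StringIO('{"host": "irc.example.org"}'))
```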
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
Markov chains used to prune the chains themselves from memory, but now
that behavior is specifically delegated up the chain to the Bot
structure instead.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* Linkbot parser also looks for <meta> tags and uses an actual HTML
parser.
* Inner title HTML is decoded before being displayed.
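For illustration, a sketch using the standard library's html.parser. Which
<meta> tags linkbot actually reads is not stated here, so `og:title` is an
assumption, as are the names `TitleParser` and `extract_title`.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; in text data.
        super().__init__(convert_charrefs=True)
        self._in_title = False
        self._title_parts = []
        self.meta_title = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("property") == "og:title":
            self.meta_title = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)

    @property
    def title(self):
        # Prefer <title>, stripped of surrounding whitespace; else the meta tag.
        text = "".join(self._title_parts).strip()
        return text or self.meta_title


def extract_title(html_text):
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title
```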
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
There's a long explanation in the code of this commit that says this:
> TL;DR OF THE BELOW: if the first parameter looks like a channel in
> addition to message type, then filter by channel. Otherwise, don't
> filter by channel.
>
> Here's the issue: plugins are *usually* multiplexed by channel. But
> that's only for messages that target channels, such as PRIVMSG and JOIN.
> For non-channel messages, such as server status messages (such as 001 on
> connect, or 372 for MOTD, etc) we want to ignore the channel aspect of
> plugin multiplexing. In order to accomplish this, we just check if the
> first parameter looks like a channel - i.e., starts with an octothorpe #.
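In sketch form, the check described above looks something like this. The
function name and its arguments are made up for illustration and are not the
actual plugin interface.

```python
def should_dispatch(handled_types, plugin_channels, command, params):
    """Decide whether a plugin should see a message.

    handled_types:   message types the plugin handles, e.g. {"PRIVMSG"}
    plugin_channels: channels the plugin is multiplexed over
    command, params: the parsed IRC message
    """
    if command not in handled_types:
        return False
    # Only filter by channel when the first parameter looks like a channel.
    if params and params[0].startswith("#"):
        return params[0] in plugin_channels
    # Server status messages (001, 372, ...) skip the channel filter.
    return True
```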
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
This allows plugins to specify the types of messages they handle. This
will be used specifically for the nickserv plugin, but could be useful
for other things too.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
When a user's name is used in the !wordbot leaderboard command, we make
every effort to not ping them by interleaving zero-width space
characters in the nickname.
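The trick amounts to something like the following. The function name is
hypothetical; U+200B is the zero-width space.

```python
ZERO_WIDTH_SPACE = "\u200b"


def unpingable(nick):
    """Put a zero-width space between every character of the nickname.

    The result renders the same in most clients, but no longer contains
    the original nick as a substring, so highlight matching should miss it.
    """
    return ZERO_WIDTH_SPACE.join(nick)
```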
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
This matches the old database format that was written a while back.
There's an "end_now" command that's been left in there for debugging
purposes; it'll be gone soon enough.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
Plugins now use Bots instead of server_configs. This is useful for
checking the currently joined channels and perhaps for using the
connection when there isn't one available in the current method.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* .items() call required when loading a markov chain into memory
* `who.nick` instead of `who` for get_chain call
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* on_connect is called on all plugins on the first message received from
the IRC server
* joined_channels property gets all of the channels that this bot has
currently joined in IRC
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
When a user updates their markov reply chance, this sends them a
message with the final value it was set to, so there are no surprises.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
Whenever someone says something, there's a chance that markov will
interject his opinion. Users can also set the chance between 0.0 and the
default value (in the config) if they want to see markov replies less
often.
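A sketch of the two pieces involved, with hypothetical function names:
clamping a user's requested chance into the range from 0.0 to the configured
default, and rolling against that chance for each message.

```python
import random


def clamp_reply_chance(requested, default_chance):
    """Keep a user's chance between 0.0 and the configured default."""
    return max(0.0, min(requested, default_chance))


def should_reply(chance, rng=random):
    """Return True with probability `chance`; rng.random() is in [0.0, 1.0)."""
    return rng.random() < chance
```

For example, with a configured default of 0.1, a request for 0.5 ends up as
0.1 and a request for -1 ends up as 0.0.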
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
If something is changed in a markov chain it gets flagged as dirty,
which is used to determine whether the chain should be saved.
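The pattern, sketched with assumed names. The `dirty` attribute, the `write`
callback, and the storage layout are illustrative, not the real Chain code.

```python
class Chain:
    def __init__(self):
        self.data = {}
        self.dirty = False  # nothing to save yet

    def add(self, word, next_word):
        followers = self.data.setdefault(word, {})
        followers[next_word] = followers.get(next_word, 0) + 1
        self.dirty = True  # the chain changed since the last save

    def save(self, write):
        """Call write(data) only if something changed; return whether it ran."""
        if not self.dirty:
            return False
        write(self.data)
        self.dirty = False
        return True
```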
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
If your chain isn't accessed for N seconds (300 by default), it will
be saved to disk and unloaded from memory.
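A sketch of the idle check under simple assumptions: `loaded` maps a chain
name to its last access timestamp, and `save` is whatever writes that chain
to disk. Both names, and `unload_idle` itself, are hypothetical.

```python
import time


def unload_idle(loaded, save, now=None, max_idle=300):
    """Save and drop every chain that has been idle for max_idle seconds."""
    now = time.time() if now is None else now
    # Collect names first so the dict is not modified while iterating.
    stale = [name for name, seen in loaded.items() if now - seen >= max_idle]
    for name in stale:
        save(name)        # persist before forgetting the chain
        del loaded[name]  # drop it from memory
    return stale
```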
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>