Module util

This module contains miscellaneous helper functions for the KOReader frontend.

Functions

stripPunctuation (text) Strips all punctuation marks and spaces from a string.
rtrim (s) Remove trailing whitespace from string.
trim (s) Remove leading & trailing whitespace from string.
gsplit (str, pattern, capture, capture_empty_entity) Splits a string by a pattern

Lua doesn't have a string.split() function and most of the time you don't really need it because string.gmatch() is enough.

secondsToClock (seconds, withoutSeconds) Converts seconds to a clock string.
secondsToHClock (seconds, withoutSeconds, hmsFormat) Converts seconds to a period of time string.
secondsToClockDuration (Either, withoutSeconds, hmsFormat) Converts seconds to a clock string in either the classic or the modern format, based on the given format preference. The "classic" format calls secondsToClock, and the "modern" format calls secondsToHClock.
secondsToHour (seconds, twelve_hour_clock) Converts timestamp to an hour string
secondsToDate (seconds, twelve_hour_clock) Converts timestamp to a date string
tableEquals (o1, o2, ignore_mt) Compares values in two different tables.
tableDeepCopy (o) Makes a deep copy of a table.
tableSize (t) Returns number of keys in a table.
arrayAppend (t1, t2) Append all elements from t2 into t1.
arrayReverse (t) Reverse array elements in-place in table t
arrayContains (t, v, callback) Test whether t contains a value equal to v (or a value for which callback returns true), and if so, return the index.
arrayReferences (t, n, m) Test whether array t contains a reference to array n (at any depth at or below m)
lastIndexOf (string, ch) Gets last index of character in string (i.e., strrchr)

Returns the index within this string of the last occurrence of the specified character or -1 if the character does not occur.

utf8Reverse (string) Reverse the individual greater-than-single-byte characters
splitToChars (text) Splits string into a list of UTF-8 characters.
isCJKChar (c) Tests whether c is a CJK character
hasCJKChar (str) Tests whether str contains CJK characters
splitToWords (text) Split texts into a list of words, spaces and punctuation marks.
isSplittable (c, next_c, prev_c) Test whether a string can be separated by this char for multi-line rendering.
getFilesystemType (path) Gets filesystem type of a path.
findFiles (path, callback) Recursively scan directory for files inside
isEmptyDir (path) Checks if directory is empty.
fileExists (path) Checks if the given path is a file.
pathExists (path) Checks if the given path exists.
makePath (path) As mkdir -p.
removeFile (path) As rm
getSafeFilename (str, path, limit) Replaces characters that are invalid in filenames.
splitFilePathName (file) Splits a file into its directory path and file name.
splitFileNameSuffix (file) Splits a file name into its pure file name and suffix
getFileNameSuffix (filename) Gets file extension
getScriptType (filename) Companion helper function that returns the script's language, based on the file extension.
getFriendlySize (size, right_align) Gets human friendly size as string
getFormattedSize (size) Gets formatted size as string (1273334 => "1,273,334")
fixUtf8 (str, replacement) Replaces invalid UTF-8 characters with a replacement string.
splitToArray (str, splitter, capture_empty_entity) Splits input string with the splitter into a table.
unicodeCodepointToUtf8 (c) Convert a Unicode codepoint (number) to a UTF-8 char. C.f., https://stackoverflow.com/a/4609989 and https://stackoverflow.com/a/38492214.

See utf8charcode in ffi/util for a decoder.

htmlEntitiesToUtf8 (string) Replace HTML entities with their UTF-8 encoded equivalent in text.
htmlToPlainText (text) Convert simple HTML to plain text.
htmlToPlainTextIfHtml (text) Convert HTML to plain text if text seems to be HTML. Detection of HTML is simple and may raise false positives or negatives, but seems quite good at guessing the content type of text found in EPUB's <dc:description>.
htmlEscape (text) Encode the HTML entities in a string
prettifyCSS (CSS) Prettify a CSS stylesheet. Not perfect, but enough to make some ugly CSS readable.
urlEncode (text) Encode URL, also known as percent-encoding; see https://en.wikipedia.org/wiki/Percent-encoding
urlDecode (text) Decode URL (reverse process to util.urlEncode())
checkLuaSyntax (text) Check lua syntax of string
stringStartsWith (str, start) Simple startsWith string helper.
stringEndsWith (str, ending) Simple endsWith string helper.

Tables

t Remove elements from an array, fast.
args Escape list for shell usage
t Clear all the elements from a table without reassignment.


Functions

stripPunctuation (text)
Strips all punctuation marks and spaces from a string.

Parameters:

  • text string the string to be stripped

Returns:

    string stripped text
rtrim (s)
Remove trailing whitespace from string.

Parameters:

  • s string the string to be trimmed

Returns:

    string trimmed text
trim (s)
Remove leading & trailing whitespace from string.

Parameters:

  • s string the string to be trimmed

Returns:

    string trimmed text
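
Both trimming helpers can be approximated with plain Lua patterns. The following is a minimal standalone sketch, not the module's own code; the names rtrim_sketch and trim_sketch are hypothetical.

```lua
-- Hypothetical stand-ins for util.rtrim and util.trim.
local function rtrim_sketch(s)
    -- Remove one trailing run of whitespace. The outer parentheses
    -- discard gsub's second return value (the substitution count).
    return (s:gsub("%s+$", ""))
end

local function trim_sketch(s)
    -- Capture everything between the leading and trailing whitespace.
    return s:match("^%s*(.-)%s*$")
end

print("[" .. rtrim_sketch("  abc  ") .. "]")
print("[" .. trim_sketch("  abc  ") .. "]")
```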
gsplit (str, pattern, capture, capture_empty_entity)
Splits a string by a pattern

Lua doesn't have a string.split() function and most of the time you don't really need it because string.gmatch() is enough. However, string.gmatch() has one significant disadvantage for me: you can't split a string while matching both the delimited strings and the delimiters themselves without tracking positions and substrings. The gsplit function below takes care of this problem.

Author: Peter Odding

License: MIT/X11

Source: http://snippets.luacode.org/snippets/Stringsplitting130

Parameters:

  • str string string to split
  • pattern the pattern to split against
  • capture bool
  • capture_empty_entity bool
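
The position tracking that the description refers to might look like the sketch below. It is an illustration of the idea only: the helper split_with_delims and its keep_delims flag are hypothetical, it returns a list instead of an iterator, and it does not model capture_empty_entity.

```lua
-- Hypothetical helper: split str on a Lua pattern, optionally keeping
-- the delimiters as separate entries in the result.
local function split_with_delims(str, pattern, keep_delims)
    local pieces = {}
    local pos = 1
    while true do
        local s, e = str:find(pattern, pos)
        if not s then break end
        if e < s then break end -- guard: the pattern matched an empty string
        pieces[#pieces + 1] = str:sub(pos, s - 1)
        if keep_delims then
            pieces[#pieces + 1] = str:sub(s, e)
        end
        pos = e + 1
    end
    -- Whatever follows the last delimiter is the final piece.
    pieces[#pieces + 1] = str:sub(pos)
    return pieces
end

print(table.concat(split_with_delims("a, b;c", "[,;]%s*", true), "|"))
```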
secondsToClock (seconds, withoutSeconds)
Converts seconds to a clock string.

Source: https://gist.github.com/jesseadams/791673

Parameters:

  • seconds int number of seconds
  • withoutSeconds bool if true 00:00, if false 00:00:00

Returns:

    string clock string in the form of 00:00 or 00:00:00
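
As a rough illustration of the two output shapes, here is a standalone sketch. The name seconds_to_clock_sketch is hypothetical, and the sketch assumes that leftover seconds are truncated rather than rounded, which may differ from the module.

```lua
-- Hypothetical stand-in for util.secondsToClock.
local function seconds_to_clock_sketch(seconds, without_seconds)
    seconds = math.floor(tonumber(seconds) or 0)
    if seconds < 0 then seconds = 0 end
    local hours = math.floor(seconds / 3600)
    local mins = math.floor(seconds / 60) % 60
    if without_seconds then
        -- Assumption: drop the seconds instead of rounding the minutes.
        return string.format("%02d:%02d", hours, mins)
    end
    return string.format("%02d:%02d:%02d", hours, mins, seconds % 60)
end

print(seconds_to_clock_sketch(5410))       -- 1 hour, 30 minutes, 10 seconds
print(seconds_to_clock_sketch(5410, true))
```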
secondsToHClock (seconds, withoutSeconds, hmsFormat)
Converts seconds to a period of time string.

Parameters:

  • seconds int number of seconds
  • withoutSeconds bool if true 1h30', if false 1h30'10''
  • hmsFormat bool if true, format as 1h30m10s

Returns:

    string clock string in the form of 1h30' or 1h30'10''
secondsToClockDuration (Either, withoutSeconds, hmsFormat)
Converts seconds to a clock string in either the classic or the modern format, based on the given format preference. The "classic" format calls secondsToClock, and the "modern" format calls secondsToHClock.

Parameters:

  • Either string "modern" for 1h30' or "classic" for 1:30
  • withoutSeconds bool if true 1h30' or 1:30, if false 1h30'10'' or 1:30:10
  • hmsFormat bool modern format only; if true, format as 1h30m10s

Returns:

    string clock string in the specific format of 1h30', 1h30'10'', 1:30 or 1:30:10
secondsToHour (seconds, twelve_hour_clock)
Converts timestamp to an hour string

Parameters:

  • seconds int number of seconds
  • twelve_hour_clock bool

Returns:

    string hour string
secondsToDate (seconds, twelve_hour_clock)
Converts timestamp to a date string

Parameters:

  • seconds int number of seconds
  • twelve_hour_clock bool

Returns:

    string date string
tableEquals (o1, o2, ignore_mt)
Compares values in two different tables.

Source: https://stackoverflow.com/a/32660766/2470572

Parameters:

  • o1 Lua table
  • o2 Lua table
  • ignore_mt bool

Returns:

    boolean
tableDeepCopy (o)
Makes a deep copy of a table.

Source: https://stackoverflow.com/a/16077650/2470572

Parameters:

  • o Lua table

Returns:

    Lua table
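
A recursive deep copy in the spirit of the linked answer could be sketched as follows. The name deep_copy_sketch is hypothetical, and two choices here are assumptions rather than documented behavior: cycles are handled through a seen table, and the copy shares the original's metatable instead of copying it.

```lua
-- Hypothetical stand-in for util.tableDeepCopy.
local function deep_copy_sketch(o, seen)
    if type(o) ~= "table" then return o end
    seen = seen or {}
    -- Reuse the copy if this table was already visited (cycle or shared reference).
    if seen[o] then return seen[o] end
    local copy = {}
    seen[o] = copy
    for k, v in pairs(o) do
        copy[deep_copy_sketch(k, seen)] = deep_copy_sketch(v, seen)
    end
    -- Assumption: share the metatable with the original.
    return setmetatable(copy, getmetatable(o))
end

local original = { name = "book", tags = { "a", "b" } }
local copy = deep_copy_sketch(original)
copy.tags[1] = "changed"
print(original.tags[1]) -- the original is unaffected by the change
```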
tableSize (t)
Returns number of keys in a table.

Parameters:

  • t Lua table

Returns:

    int number of keys in table t
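
Counting keys differs from the length operator, which only considers the array part of a table. A minimal sketch, with the hypothetical name table_size_sketch:

```lua
-- Hypothetical stand-in for util.tableSize: count every key, not just 1..n.
local function table_size_sketch(t)
    local n = 0
    for _ in pairs(t) do
        n = n + 1
    end
    return n
end

local t = { 10, 20, name = "x" }
-- The length operator sees the two array entries; the key count also includes "name".
print(#t, table_size_sketch(t))
```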
arrayAppend (t1, t2)
Append all elements from t2 into t1.

Parameters:

  • t1 Lua table
  • t2 Lua table
arrayReverse (t)
Reverse array elements in-place in table t

Parameters:

  • t Lua table
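
An in-place reversal is typically a two-index swap loop. The following sketch shows the idea under the hypothetical name array_reverse_sketch; it is not taken from the module.

```lua
-- Hypothetical stand-in for util.arrayReverse: swap from both ends inward.
local function array_reverse_sketch(t)
    local i, j = 1, #t
    while i < j do
        t[i], t[j] = t[j], t[i]
        i, j = i + 1, j - 1
    end
end

local letters = { "a", "b", "c", "d" }
array_reverse_sketch(letters)
print(table.concat(letters, ","))
```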
arrayContains (t, v, callback)
Test whether t contains a value equal to v (or a value for which callback returns true), and if so, return the index.

Parameters:

  • t Lua table
  • v
  • callback func (v1, v2)
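
The lookup can be pictured as a linear scan. In the sketch below, the name array_contains_sketch is hypothetical, and two details are assumptions because the documentation does not state them: the callback receives the array element first and v second, and false is returned when nothing matches.

```lua
-- Hypothetical stand-in for util.arrayContains.
local function array_contains_sketch(t, v, callback)
    for i, item in ipairs(t) do
        local hit
        if callback then
            hit = callback(item, v) -- assumed argument order: element, then v
        else
            hit = (item == v)
        end
        if hit then return i end
    end
    return false -- assumed "not found" value
end

local names = { "Alpha", "Beta", "Gamma" }
print(array_contains_sketch(names, "Beta"))
-- Case-insensitive comparison through the callback.
print(array_contains_sketch(names, "gamma", function(a, b)
    return a:lower() == b:lower()
end))
```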
arrayReferences (t, n, m)
Test whether array t contains a reference to array n (at any depth at or below m)

Parameters:

  • t Lua table (array only)
  • n Lua table (array only)
  • m int Max nesting level
lastIndexOf (string, ch)
Gets last index of character in string (i.e., strrchr)

Returns the index within this string of the last occurrence of the specified character or -1 if the character does not occur.

To find . you need to escape it.

Parameters:

Returns:

    int last occurrence or -1 if not found
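
The remark about escaping . suggests that ch is used as a Lua pattern. Under that assumption, the behavior can be sketched with a greedy match and a position capture; the name last_index_of_sketch is hypothetical.

```lua
-- Hypothetical stand-in for util.lastIndexOf, assuming ch is a Lua pattern.
local function last_index_of_sketch(str, ch)
    -- ".*" is greedy, so "()" captures the position of the last match of ch.
    local i = str:match(".*()" .. ch)
    return i or -1
end

print(last_index_of_sketch("a.b.c", "%.")) -- escaped: position of the last literal dot
print(last_index_of_sketch("a.b.c", "."))  -- unescaped: "." matches any character
print(last_index_of_sketch("abc", "x"))
```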
utf8Reverse (string)
Reverse the individual greater-than-single-byte characters

Parameters:

splitToChars (text)
Splits string into a list of UTF-8 characters.

Parameters:

  • text string the string to be split.

Returns:

    table list of UTF-8 chars
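
One common way to split UTF-8 text is to match a non-continuation byte followed by any continuation bytes (0x80 to 0xBF). The sketch below uses that pattern under the hypothetical name split_to_chars_sketch; it assumes well-formed input and is not necessarily how the module does it.

```lua
-- Hypothetical stand-in for util.splitToChars, assuming valid UTF-8 input.
local function split_to_chars_sketch(text)
    local chars = {}
    -- One lead (or ASCII) byte, then zero or more continuation bytes.
    for ch in text:gmatch("[^\128-\191][\128-\191]*") do
        chars[#chars + 1] = ch
    end
    return chars
end

-- "a", U+00E9 (2 bytes) and U+4E2D (3 bytes), written as byte escapes.
local chars = split_to_chars_sketch("a\195\169\228\184\173")
print(#chars)
```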
isCJKChar (c)
Tests whether c is a CJK character

Parameters:

Returns:

    boolean true if CJK
hasCJKChar (str)
Tests whether str contains CJK characters

Parameters:

Returns:

    boolean true if CJK
splitToWords (text)
Split texts into a list of words, spaces and punctuation marks.

Parameters:

Returns:

    table list of words, spaces and punctuation marks
isSplittable (c, next_c, prev_c)
Test whether a string can be separated by this char for multi-line rendering. Optional next or prev chars may be provided to help make the decision

Parameters:

Returns:

    boolean true if splittable, false if not
getFilesystemType (path)
Gets filesystem type of a path.

Checks if the path occurs in /proc/mounts

Parameters:

  • path string an absolute path

Returns:

    string filesystem type
findFiles (path, callback)
Recursively scan directory for files inside

Parameters:

  • path string
  • callback func (fullpath, name, attr)
isEmptyDir (path)
Checks if directory is empty.

Parameters:

Returns:

    bool
fileExists (path)
Checks if the given path is a file.

Parameters:

Returns:

    bool
pathExists (path)
Checks if the given path exists. Doesn't care if it's a file or directory.

Parameters:

Returns:

    bool
makePath (path)
As mkdir -p. Unlike lfs.mkdir(), does not error if the directory already exists, and creates intermediate directories as needed.

Parameters:

  • path string the directory to create

Returns:

    bool true on success; nil, err_message on error
removeFile (path)
As rm

Parameters:

  • path string of the file to remove

Returns:

    bool true on success; nil, err_message on error
getSafeFilename (str, path, limit)
Replaces characters that are invalid in filenames.

Replaces the characters \/:*?"<>| with an _ unless an optional path is provided. These characters are problematic on Windows filesystems. On Linux only the / poses a problem.

If an optional path is provided, util.getFilesystemType() will be used to determine whether stricter VFAT restrictions should be applied.

Parameters:

Returns:

    string safe filename
splitFilePathName (file)
Splits a file into its directory path and file name. If the given path has a trailing /, returns the entire path as the directory path and "" as the file name.

Parameters:

Returns:

    string directory, filename
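
The split, including the trailing / case described above, can be illustrated with a single pattern match. The name split_path_sketch is hypothetical, and the result for a path without any / is an assumption.

```lua
-- Hypothetical stand-in for util.splitFilePathName.
local function split_path_sketch(file)
    -- Greedy "(.*/)" takes everything up to and including the last slash.
    local dir, name = file:match("^(.*/)([^/]*)$")
    if not dir then
        return "", file -- assumption: no slash means an empty directory part
    end
    return dir, name
end

print(split_path_sketch("/mnt/us/books/a.epub"))
print(split_path_sketch("/mnt/us/books/"))
```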
splitFileNameSuffix (file)
Splits a file name into its pure file name and suffix

Parameters:

Returns:

    string path, extension
getFileNameSuffix (filename)
Gets file extension

Parameters:

Returns:

    string extension
getScriptType (filename)
Companion helper function that returns the script's language, based on the file extension.

Parameters:

Returns:

    string (lowercase) (or nil if not Device:canExecuteScript(file))
getFriendlySize (size, right_align)
Gets human friendly size as string

Parameters:

  • size int (bytes)
  • right_align bool (by padding with spaces on the left)

Returns:

    string
getFormattedSize (size)
Gets formatted size as string (1273334 => "1,273,334")

Parameters:

  • size int (bytes)

Returns:

    string
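
Thousands separators can be inserted by repeatedly splitting off the last three digits of the leading digit run. This is a standalone sketch under the hypothetical name formatted_size_sketch, not the module's implementation.

```lua
-- Hypothetical stand-in for util.getFormattedSize.
local function formatted_size_sketch(size)
    local s = string.format("%.0f", size)
    local n
    repeat
        -- Insert one comma before the last three digits of the leading run.
        s, n = s:gsub("^(-?%d+)(%d%d%d)", "%1,%2")
    until n == 0
    return s
end

print(formatted_size_sketch(1273334))
```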
fixUtf8 (str, replacement)
Replaces invalid UTF-8 characters with a replacement string.

Based on http://notebook.kulchenko.com/programming/fixing-malformed-utf8-in-lua. c.f., FixUTF8 @ https://github.com/pkulchenko/ZeroBraneStudio/blob/master/src/util.lua.

Parameters:

  • str string the string to be checked for invalid characters
  • replacement string the string to replace invalid characters with

Returns:

    string valid UTF-8
splitToArray (str, splitter, capture_empty_entity)
Splits input string with the splitter into a table. This function ignores the last empty entity.

Parameters:

  • str string the string to be split
  • splitter string
  • capture_empty_entity bool

Returns:

    an array-like table
unicodeCodepointToUtf8 (c)
Convert a Unicode codepoint (number) to a UTF-8 char. C.f., https://stackoverflow.com/a/4609989 and https://stackoverflow.com/a/38492214.

See utf8charcode in ffi/util for a decoder.

Parameters:

  • c int Unicode codepoint

Returns:

    string UTF-8 char
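
The standard UTF-8 bit layout can be written with plain arithmetic, which keeps the code independent of any bit library. The sketch below uses the hypothetical name codepoint_to_utf8_sketch, assumes c is a non-negative integer, and returns nil for values above U+10FFFF as an assumption.

```lua
-- Hypothetical stand-in for util.unicodeCodepointToUtf8.
local function codepoint_to_utf8_sketch(c)
    local floor, char = math.floor, string.char
    if c < 0x80 then
        return char(c)                                  -- 1 byte: 0xxxxxxx
    elseif c < 0x800 then
        return char(0xC0 + floor(c / 0x40),             -- 2 bytes: 110xxxxx
                    0x80 + c % 0x40)
    elseif c < 0x10000 then
        return char(0xE0 + floor(c / 0x1000),           -- 3 bytes: 1110xxxx
                    0x80 + floor(c / 0x40) % 0x40,
                    0x80 + c % 0x40)
    elseif c < 0x110000 then
        return char(0xF0 + floor(c / 0x40000),          -- 4 bytes: 11110xxx
                    0x80 + floor(c / 0x1000) % 0x40,
                    0x80 + floor(c / 0x40) % 0x40,
                    0x80 + c % 0x40)
    end
    return nil -- assumption: out-of-range input yields nil
end

print(codepoint_to_utf8_sketch(0x4E2D))
```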
htmlEntitiesToUtf8 (string)
Replace HTML entities with their UTF-8 encoded equivalent in text.

Supports only basic ones and those with numbers (no support for named entities like &eacute;).

Parameters:

  • string string text with HTML entities

Returns:

    string UTF-8 text
htmlToPlainText (text)
Convert simple HTML to plain text.

This may fail on complex HTML (with styles, scripts, comments), but should be fine enough with simple HTML as found in EPUB's <dc:description>.

Parameters:

Returns:

    string plain text
htmlToPlainTextIfHtml (text)
Convert HTML to plain text if text seems to be HTML. Detection of HTML is simple and may raise false positives or negatives, but seems quite good at guessing the content type of text found in EPUB's <dc:description>.

Parameters:

  • text string the string with possibly some HTML

Returns:

    string cleaned text
htmlEscape (text)
Encode the HTML entities in a string

Parameters:

  • text string the string to escape

Source: https://github.com/kernelsauce/turbo/blob/e4a35c2e3fb63f07464f8f8e17252bea3a029685/turbo/escape.lua#L58-L70
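
Escaping of this kind is usually a single gsub with a lookup table. The sketch below is illustrative only: the name html_escape_sketch is hypothetical, and the exact set of escaped characters and the entity chosen for the apostrophe are assumptions, not taken from the module.

```lua
-- Assumed replacement table; the module's actual set may differ.
local ESCAPES = {
    ["&"] = "&amp;",
    ["<"] = "&lt;",
    [">"] = "&gt;",
    ['"'] = "&quot;",
    ["'"] = "&#39;",
}

-- Hypothetical stand-in for util.htmlEscape.
local function html_escape_sketch(text)
    -- gsub looks each matched character up in ESCAPES.
    return (text:gsub("[&<>\"']", ESCAPES))
end

print(html_escape_sketch('<a href="x">T&C</a>'))
```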
prettifyCSS (CSS)
Prettify a CSS stylesheet. Not perfect, but enough to make some ugly CSS readable. By default, each selector and each property is put on its own line. With condensed=true, each full declaration is condensed onto a single line.

Parameters:

Returns:

    string the CSS prettified
urlEncode (text)
Encode URL, also known as percent-encoding; see https://en.wikipedia.org/wiki/Percent-encoding

Parameters:

  • text string the string to encode

Returns:

    string the encoded string

Source: https://gist.github.com/liukun/f9ce7d6d14fa45fe9b924a3eed5c3d99
urlDecode (text)
Decode URL (reverse process to util.urlEncode())

Parameters:

  • text string the string to decode

Returns:

    string the decoded string

Source: https://gist.github.com/liukun/f9ce7d6d14fa45fe9b924a3eed5c3d99
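
Percent-encoding and its inverse can each be expressed as one gsub. In this sketch the names url_encode_sketch and url_decode_sketch are hypothetical, and the set of characters left unencoded (letters, digits, -, _, . and ~) is an assumption that may not match the module.

```lua
-- Hypothetical stand-in for util.urlEncode.
local function url_encode_sketch(text)
    -- Replace every byte outside the assumed "unreserved" set with %XX.
    return (text:gsub("[^%w%-_%.~]", function(ch)
        return string.format("%%%02X", ch:byte())
    end))
end

-- Hypothetical stand-in for util.urlDecode.
local function url_decode_sketch(text)
    -- Turn each %XX escape back into the byte it encodes.
    return (text:gsub("%%(%x%x)", function(hex)
        return string.char(tonumber(hex, 16))
    end))
end

local encoded = url_encode_sketch("a b&c/d")
print(encoded)
print(url_decode_sketch(encoded))
```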
checkLuaSyntax (text)
Check lua syntax of string

Parameters:

Returns:

    string with parsing error, nil if syntax ok
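
A syntax check can be done by compiling the text without running it. The sketch below, under the hypothetical name check_lua_syntax_sketch, follows the documented return convention (error string, or nil when the syntax is fine); whether the module post-processes the error message is not covered here.

```lua
-- Hypothetical stand-in for util.checkLuaSyntax.
local function check_lua_syntax_sketch(text)
    -- loadstring exists in Lua 5.1 and LuaJIT; load accepts strings in 5.2+.
    local loader = loadstring or load
    local chunk, err = loader(text)
    if chunk then
        return nil -- compiled fine; the chunk is never executed
    end
    return err
end

print(check_lua_syntax_sketch("local x = 1"))
print(check_lua_syntax_sketch("local x = = 1"))
```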
stringStartsWith (str, start)
Simple startsWith string helper.

C.f., http://lua-users.org/wiki/StringRecipes.

Parameters:

Returns:

    bool true on success
stringEndsWith (str, ending)
Simple endsWith string helper.

Parameters:

Returns:

    bool true on success
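
Both helpers reduce to a substring comparison, as in the StringRecipes page referenced for stringStartsWith. The names starts_with_sketch and ends_with_sketch are hypothetical.

```lua
-- Hypothetical stand-in for util.stringStartsWith.
local function starts_with_sketch(str, start)
    return str:sub(1, #start) == start
end

-- Hypothetical stand-in for util.stringEndsWith.
local function ends_with_sketch(str, ending)
    -- str:sub(-0) would return the whole string, so the empty case is handled first.
    return ending == "" or str:sub(-#ending) == ending
end

print(starts_with_sketch("koreader.lua", "ko"))
print(ends_with_sketch("koreader.lua", ".lua"))
```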

Tables

t
Remove elements from an array, fast.

Swap & pop, like http://lua-users.org/lists/lua-l/2013-11/msg00027.html / https://stackoverflow.com/a/28942022, but preserving order. c.f., https://stackoverflow.com/a/53038524

Fields:

  • keep_cb func Filtering callback. Takes three arguments: table, index, new index. Returns true to keep the item. See link above for potential uses of the third argument.

Usage:

    local foo = { "a", "b", "c", "b", "d", "e" }
    local function drop_b(t, i, j)
        -- Discard any item with value "b"
        return t[i] ~= "b"
    end
    util.arrayRemove(foo, drop_b)
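
An order-preserving, single-pass removal along the lines of the linked answers can be sketched as below. The name array_remove_sketch is hypothetical and this is not the module's code; it only illustrates how kept items are shifted down to the next free slot, which is what the third callback argument refers to.

```lua
-- Hypothetical stand-in for util.arrayRemove.
local function array_remove_sketch(t, keep_cb)
    local j, n = 1, #t
    for i = 1, n do
        if keep_cb(t, i, j) then
            -- Keep: move the item down to slot j if items before it were dropped.
            if i ~= j then
                t[j] = t[i]
                t[i] = nil
            end
            j = j + 1
        else
            t[i] = nil
        end
    end
    return t
end

local foo = { "a", "b", "c", "b", "d", "e" }
array_remove_sketch(foo, function(t, i, j)
    return t[i] ~= "b"
end)
print(table.concat(foo, ","))
```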
args
Escape list for shell usage
t
Clear all the elements from a table without reassignment.
generated by LDoc 1.4.6 Last updated 2021-09-20 13:36:42