NAME
urlwatch-intro - Introduction to basic urlwatch usage

QUICK START
The checking interval is defined by how often you run urlwatch. You can use e.g. crontab.guru <https://crontab.guru> to figure out the schedule expression for the checking interval. We recommend checking no more often than every 30 minutes (this would be */30 * * * *). If you have never used cron before, check out the crontab command help <https://www.computerhope.com/unix/ucrontab.htm>.

On Windows, cron is not installed by default. Use the Windows Task Scheduler <https://en.wikipedia.org/wiki/Windows_Task_Scheduler> instead, or see this StackOverflow question <https://stackoverflow.com/q/132971/1047040> for alternatives.

HOW IT WORKS
Every time you run urlwatch(1), it:

JOBS AND FILTERS
Each website or shell command to be monitored constitutes a "job".

The instructions for each such job are contained in a config file in the YAML format <https://yaml.org/spec/>. If you have more than one job, you separate them with a line containing only ---.

You can edit the job and filter configuration file using:

    urlwatch --edit

If you get an error, set your $EDITOR (or $VISUAL) environment variable in your shell, for example:

    export EDITOR=/bin/nano

While you can edit the YAML file manually, using --edit will do sanity checks before activating the new configuration file.

Kinds of Jobs
Each job must have exactly one of the following keys, which also defines the kind of job:

Each job can have an optional name key to define a user-visible name for the job. You can then use further optional keys to finely control the job's parameters. See urlwatch-jobs(5) for detailed information on job configuration.

Filters
You may use the filter key to select one or more filters to apply to the data after it is retrieved, for example to:

These filters can be chained. As an example, after retrieving an HTML document by using the url key, you can extract a selection with the xpath filter, convert this to text with html2text, use grep to extract only lines matching a specific regular expression, and then sort them:

    name: "Sample urlwatch job definition"
    url: "https://example.dummy/"
    https_proxy: "http://dummy.proxy/"
    max_tries: 2
    filter:
      - xpath: '//section[@role="main"]'
      - html2text:
          method: pyhtml2text
          unicode_snob: true
          body_width: 0
          inline_links: false
          ignore_links: true
          ignore_images: true
          pad_tables: false
          single_line_break: true
      - grep: "lines I care about"
      - sort:
    ---

See urlwatch-filters(5) for detailed information on filter configuration.

REPORTERS
urlwatch can be configured to do something with its report besides (or in addition to) the default of displaying it on the console.

Reporters are configured in the global configuration file:

    urlwatch --edit-config

Examples of reporters:

See urlwatch-reporters(5) for reporter configuration options.

SEE ALSO
urlwatch(1), urlwatch-jobs(5), urlwatch-filters(5), urlwatch-config(5), urlwatch-reporters(5), cron(8)

COPYRIGHT
2022 Thomas Perl