pywikitools.resourcesbot.export_html
Module Contents
Classes
CustomBeautifyHTML | Class to collect all images used in the generated HTML files |
ExportHTML | Export all finished worksheets of this language as HTML into a folder |
- class pywikitools.resourcesbot.export_html.CustomBeautifyHTML(change_hrefs: Dict[str, str], file_collector: Set[str])
Bases: pywikitools.htmltools.beautify_html.BeautifyHTML
Class to collect all images used in the generated HTML files.
TODO: do something about links to worksheets that are not translated yet
- img_rewrite_handler(self, element)
Do some rewriting of <img> elements
In our default implementation we remove the srcset attribute (as we don’t need it) and apply replacements for the src attribute.
You can customize the behaviour by sub-classing BeautifyHTML and overriding this method.
@param element: Part of the BeautifulSoup data structure; it is modified in place
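The rewriting described above (drop srcset, replace src) can be sketched without BeautifulSoup by working on a plain attribute dictionary. This is only an illustration, not the module's actual code: the function name `rewrite_img_attrs`, the `src_replacements` mapping and the way file names are collected are all assumptions. With BeautifulSoup, the `attrs` dictionary of the element could be passed in.

```python
from typing import Dict, MutableMapping, Set


def rewrite_img_attrs(attrs: MutableMapping[str, str],
                      src_replacements: Dict[str, str],
                      file_collector: Set[str]) -> None:
    """Hypothetical helper: rewrite the attributes of one <img> element in place.

    attrs            -- attribute dictionary of the element
    src_replacements -- assumed mapping from original src to replacement src
    file_collector   -- set that gathers the file names of all images seen
    """
    # The srcset attribute is not needed in the exported HTML, so drop it.
    attrs.pop("srcset", None)

    src = attrs.get("src")
    if src is None:
        # Nothing to replace or collect for an <img> without src.
        return

    # Apply a replacement if one is configured, otherwise keep src unchanged.
    new_src = src_replacements.get(src, src)
    attrs["src"] = new_src

    # Assumption: only the file name (last path component) is collected.
    file_collector.add(new_src.rsplit("/", 1)[-1])
```

Collecting into a set that the caller owns means one set can be shared across all worksheets of a language, so each image is recorded only once.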
- class pywikitools.resourcesbot.export_html.ExportHTML(fortraininglib: pywikitools.fortraininglib.ForTrainingLib, folder: str, *, force_rewrite: bool = False)
Bases: pywikitools.resourcesbot.post_processing.LanguagePostProcessor
Export all finished worksheets of this language as HTML into a folder.
This is a step towards having a git repo with this content always up-to-date.
- has_relevant_change(self, worksheet: str, change_log: pywikitools.resourcesbot.changes.ChangeLog) → bool
Is there a relevant change for worksheet?
TODO: Define what exactly we consider relevant (for re-generating that worksheet’s HTML)
- download_file(self, files_folder: str, filename: str) → bool
Download a file from the mediawiki server
If a file already exists locally, we don’t download it again because usually those files (graphics) don’t change.
TODO: Implement a way to force re-downloading of files (in case a file was updated in the mediawiki system). Two possible ways:
- an extra flag (e.g. --force-rewrite-files)
- getting the time stamp of the file in the mediawiki system, comparing it with the last-modified timestamp of the local file, and downloading again if the first is newer (would require adjustments of get_file_url() to also request the timestamp)
@return True if we actually downloaded the file, False if not
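The skip-if-present rule and the two options from the TODO can be sketched as a single decision function. This is a hypothetical illustration, not part of the module: the name `should_download` and the `remote_mtime` and `force` parameters are assumptions, and the remote timestamp is assumed to be available as seconds since the epoch.

```python
import os
from typing import Optional


def should_download(local_path: str,
                    remote_mtime: Optional[float] = None,
                    force: bool = False) -> bool:
    """Hypothetical helper: decide whether a file needs to be (re-)downloaded.

    local_path   -- where the file would be stored locally
    remote_mtime -- assumed last-modified time on the server (epoch seconds),
                    or None if unknown
    force        -- assumed flag that always triggers a download
    """
    if force:
        # Option 1 from the TODO: an explicit flag overrides everything.
        return True
    if not os.path.exists(local_path):
        # Behaviour described above: missing files are always downloaded.
        return True
    if remote_mtime is not None:
        # Option 2 from the TODO: download again only if the server copy is newer.
        return remote_mtime > os.path.getmtime(local_path)
    # File exists and there is no reason to assume it changed.
    return False
```

Keeping the decision separate from the actual transfer makes it easy to test without network access.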
- run(self, language_info: pywikitools.resourcesbot.data_structures.LanguageInfo, english_info: pywikitools.resourcesbot.data_structures.LanguageInfo, change_log: pywikitools.resourcesbot.changes.ChangeLog)
Entry point