Scrapers
Base
-
class scrapers.base.BaseScraper(teiid_user=None, teiid_password=None, kerberos=False, neo4j_user='neo4j', neo4j_password='neo4j', neo4j_server='localhost', neo4j_scheme='bolt')
Base scraper class to standardize the main scraper functionality.
-
default_since = '2020-05-18'
-
default_until = '2021-05-19'
-
run(since=None)
Run the scraper.
- Parameters
since (str) – a datetime to start scraping data from
- Raises
NotImplementedError – if the function is not overridden
-
teiid_host = 'virtualdb.engineering.redhat.com'
-
teiid_port = 5432
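Since run raises NotImplementedError unless it is overridden, each concrete scraper is expected to supply its own run. The following is a minimal sketch of that pattern. BaseScraperSketch and ExampleScraper are hypothetical stand-ins, not the real classes, and the fallback to default_since when since is omitted is an assumption rather than documented behavior.

```python
class BaseScraperSketch:
    """Hypothetical stand-in for scrapers.base.BaseScraper (not the real class)."""

    # Class-level defaults mirroring the documented attributes
    default_since = '2020-05-18'
    teiid_host = 'virtualdb.engineering.redhat.com'
    teiid_port = 5432

    def run(self, since=None):
        """Run the scraper; subclasses are expected to override this."""
        raise NotImplementedError('run() must be overridden by a subclass')


class ExampleScraper(BaseScraperSketch):
    """Hypothetical subclass showing the override."""

    def run(self, since=None):
        # ASSUMPTION: fall back to the class default when no start date is given
        start = since or self.default_since
        return 'scraping from {0}'.format(start)
```

Calling ExampleScraper().run() in this sketch uses the class default, while calling run on the base stand-in raises NotImplementedError.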
-
Bugzilla
-
class scrapers.bugzilla.BugzillaScraper(teiid_user=None, teiid_password=None, kerberos=False, neo4j_user='neo4j', neo4j_password='neo4j', neo4j_server='localhost', neo4j_scheme='bolt')
Scrapes the Bugzilla tables in Teiid.
-
create_user_node(email)
Create a User node in Neo4j.
- Parameters
email (str) – the user’s email
- Returns
User object
-
get_bugzilla_bugs(start_date, end_date)
Get the Bugzilla bug information from Teiid.
- Parameters
start_date (datetime.datetime) – when to start scraping data from
end_date (datetime.datetime) – when to stop scraping data
- Returns
list of dictionaries containing bug info
- Return type
list
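The base run method documents since as a str, while get_bugzilla_bugs takes datetime.datetime objects. One way to bridge the two is sketched below. The helper name parse_scrape_date is hypothetical, and the 'YYYY-MM-DD' format is an assumption inferred from the documented default_since and default_until values.

```python
from datetime import datetime


def parse_scrape_date(value, fmt='%Y-%m-%d'):
    """Convert a 'YYYY-MM-DD' string to datetime.datetime; pass datetimes through."""
    if isinstance(value, datetime):
        return value
    # ASSUMPTION: date strings follow the same format as the documented defaults
    return datetime.strptime(value, fmt)


# The documented class defaults, converted to datetime objects
start_date = parse_scrape_date('2020-05-18')
end_date = parse_scrape_date('2021-05-19')
```

The resulting start_date and end_date have the types that get_bugzilla_bugs documents for its parameters.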
-
DistGit
-
class scrapers.distgit.DistGitScraper(teiid_user=None, teiid_password=None, kerberos=False, neo4j_user='neo4j', neo4j_password='neo4j', neo4j_server='localhost', neo4j_scheme='bolt')
Scrapes the GitBZ tables in Teiid.
-
get_distgit_data(since, until)
Query Teiid for the dist-git commit and Bugzilla information.
- Parameters
since (datetime.datetime) – determines when to start the query
until (datetime.datetime) – determines when to stop the query
- Returns
a list of dictionaries
- Return type
list
-
Errata
-
class scrapers.errata.ErrataScraper(teiid_user=None, teiid_password=None, kerberos=False, neo4j_user='neo4j', neo4j_password='neo4j', neo4j_server='localhost', neo4j_scheme='bolt')
Scrapes the Errata Tool tables in Teiid.
-
get_advisories(since, until)
Query Teiid for the Errata Tool advisories.
- Parameters
since (datetime.datetime) – determines when to start querying
until (datetime.datetime) – determines when to stop querying
- Returns
a list of dictionaries
- Return type
list
-
get_associated_builds(advisory_id)
Query Teiid to find the Brew builds associated with a specific advisory.
-
get_attached_bugs(advisory_id)
Query Teiid to find the Bugzilla bugs attached to a specific advisory.
-
get_koji_build(build_id)
Query Teiid to find the Koji build attached to a specific advisory.
-
Freshmaker
-
class scrapers.freshmaker.FreshmakerScraper(teiid_user=None, teiid_password=None, kerberos=False, neo4j_user='neo4j', neo4j_password='neo4j', neo4j_server='localhost', neo4j_scheme='bolt')
Scrapes the Freshmaker API.
-
freshmaker_url = 'https://freshmaker.engineering.redhat.com/api/2/events/?per_page=50'
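The per_page=50 query parameter suggests the events endpoint is paginated. The sketch below shows one way a client could walk such pages. It is not the scraper's actual implementation: iter_events and fetch_json are hypothetical names, and the response shape (events under "items", the next page URL under meta["next"]) is an assumption. The fetcher is injected so the example runs without network access.

```python
def iter_events(first_url, fetch_json):
    """Yield events page by page, following an assumed 'next' link."""
    url = first_url
    while url:
        page = fetch_json(url)
        # ASSUMPTION: events live under "items" and the next page URL under meta["next"]
        for event in page.get('items', []):
            yield event
        url = page.get('meta', {}).get('next')


# Fake two-page response used in place of a real HTTP call
fake_pages = {
    'https://example.invalid/events/?per_page=2': {
        'items': [{'id': 1}, {'id': 2}],
        'meta': {'next': 'https://example.invalid/events/?per_page=2&page=2'},
    },
    'https://example.invalid/events/?per_page=2&page=2': {
        'items': [{'id': 3}],
        'meta': {'next': None},
    },
}

event_ids = [
    event['id']
    for event in iter_events('https://example.invalid/events/?per_page=2',
                             fake_pages.__getitem__)
]
```

In real use, fetch_json would wrap an HTTP GET that returns the decoded JSON body; the loop stops once no next URL is reported.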
-
Koji
-
class scrapers.koji.KojiScraper(teiid_user=None, teiid_password=None, kerberos=False, neo4j_user='neo4j', neo4j_password='neo4j', neo4j_server='localhost', neo4j_scheme='bolt')
Scrapes the Koji tables in Teiid.
Query Teiid for all tags a build is tagged in.
-
get_koji_builds(start_date, end_date)
Query Teiid for Koji builds.
- Parameters
start_date (datetime.datetime) – determines when to start the query
end_date (datetime.datetime) – determines when to stop the query
- Returns
a list of dictionaries
- Return type
list
Teiid
-
class scrapers.teiid.Teiid(host, port, username, password)
Abstracts interfacing with Teiid to simplify connections and queries.
-
get_connection(db_name, force_new=False, retry=None)
Return an existing psycopg2 connection, establishing one if needed.
- Parameters
- Returns
a connection to Teiid
- Return type
psycopg2 connection
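The "return an existing connection or establish one" behavior can be illustrated with a small cache keyed by database name. This is a sketch under stated assumptions, not the real Teiid class: ConnectionCacheSketch is a hypothetical stand-in, the connect callable is injected in place of psycopg2.connect so the example runs without a database, and treating retry as a count of extra attempts is a guess at its meaning.

```python
class ConnectionCacheSketch:
    """Hypothetical stand-in illustrating cached connections; not the real Teiid class."""

    def __init__(self, connect):
        # `connect` is any callable taking db_name and returning a connection
        # (ASSUMPTION: in real use this would wrap psycopg2.connect)
        self._connect = connect
        self._connections = {}

    def get_connection(self, db_name, force_new=False, retry=None):
        """Return the cached connection for db_name, creating it if needed."""
        if not force_new and db_name in self._connections:
            return self._connections[db_name]
        # ASSUMPTION: retry is the number of extra attempts after the first failure
        attempts = 1 + (retry or 0)
        for attempt in range(attempts):
            try:
                conn = self._connect(db_name)
                break
            except ConnectionError:
                # Re-raise once the last allowed attempt has failed
                if attempt == attempts - 1:
                    raise
        self._connections[db_name] = conn
        return conn
```

With this design, repeated calls for the same db_name reuse one connection, and force_new=True replaces the cached entry with a fresh one.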
-