API Reference

This section provides documentation for the main APCloudy classes and methods.

APCloudyClient

class apcloudy.APCloudyClient(api_key: str = '', settings=None)

Bases: object

Represents a client for interacting with the APCloudy API.

This class provides methods to make authenticated HTTP requests, manage projects (e.g., retrieval, creation, and listing), and validate the connection with the APCloudy API. The client supports additional features such as retry logic for transient errors and rate-limiting compliance.

Variables:

api_key – The API key used for authentication with the APCloudy API.
base_url – The base URL for the APCloudy API.
session – The session object used to handle HTTP requests.

__init__(api_key: str = '', settings=None)

Initialize APCloudy client

Parameters:

api_key – Your APCloudy API key
settings – Scrapy settings object (optional)

http_request(method: str, endpoint: str, **kwargs) → Dict[str, Any]

Make an API request with retry logic

Parameters:

method – HTTP method
endpoint – API endpoint
**kwargs – Additional arguments for requests

Returns:

API response data

Return type:

Dict
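
The retry behaviour mentioned for http_request can be pictured with a minimal sketch. This is not the library's implementation: `call_with_retry`, `TransientError`, the attempt count, and the exponential backoff schedule are all assumptions chosen for illustration.

```python
import time
from typing import Any, Callable, Dict


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (e.g. a timeout or HTTP 503)."""


def call_with_retry(
    send: Callable[[], Dict[str, Any]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call send() until it succeeds or max_attempts is exhausted.

    Waits base_delay * 2**(attempt - 1) seconds between attempts.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting; a caller would normally leave it at the default.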

get_project(project_id: int) → ProjectManager

Get a project manager for the specified project

Parameters: project_id – Project ID
Returns: Project manager instance
Return type: ProjectManager

list_projects() → Project

List all projects

Returns: Available projects
Return type: List[Project]

create_project(name: str, description: str = '') → Project

Create a new project

Parameters:

name – Project name
description – Project description

Returns:

Created project

Return type:

Project
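
As an illustration of how list_projects() and create_project() might be combined, here is a small sketch. The helper name `find_or_create_project` is hypothetical, and the sketch assumes list_projects() returns an iterable of Project objects (as its documented return type suggests). The client is typed loosely so the function works with any object exposing those two methods.

```python
from typing import Any


def find_or_create_project(client: Any, name: str, description: str = "") -> Any:
    """Return the first project called `name`, creating it if none exists.

    `client` is expected to be an APCloudyClient; only the documented
    list_projects() and create_project() methods are used.
    """
    for project in client.list_projects():
        if project.name == name:
            return project
    return client.create_project(name, description)
```

With the package installed, this would presumably be called with an APCloudyClient instance constructed from your API key.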

Main Methods

Project Operations

get_projects() - List all projects
get_project(project_id) - Get a specific project

Spider Operations

get_spiders(project_id) - List spiders in a project
get_spider(project_id, spider_name) - Get a specific spider

Job Operations

get_jobs(project_id) - List jobs for a project
get_job(job_id) - Get job details
start_job(project_id, spider_name) - Start a new job
stop_job(job_id) - Stop a running job
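
The job operations above could be combined into a start-and-poll loop along the following lines. This is a sketch under several assumptions not confirmed by this reference: that the methods live on a single client-like object (called `api` here), that start_job() returns an object with a job_id attribute, and that get_job() returns a Job whose state is a JobState member. The helper name, poll interval, and poll limit are hypothetical.

```python
import time
from typing import Any, Callable


def run_and_wait(
    api: Any,
    project_id: int,
    spider_name: str,
    poll_interval: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Start a job and poll until it is no longer scheduled or running."""
    job = api.start_job(project_id, spider_name)
    for _ in range(max_polls):
        job = api.get_job(job.job_id)
        # Accept either a JobState member or its raw string value.
        if getattr(job.state, "value", job.state) not in ("scheduled", "running"):
            return job
        sleep(poll_interval)
    api.stop_job(job.job_id)  # give up: stop the job rather than leave it running
    raise TimeoutError(f"job {job.job_id} did not finish after {max_polls} polls")
```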

Models

Job State

class apcloudy.models.JobState(value)

Represents the state of a job in a task or workflow management system.

This enumeration defines the states a job passes through during its lifecycle: scheduled, running, completed, and deleted. These states support job tracking, control, and monitoring.

SCHEDULED = 'scheduled'

RUNNING = 'running'

COMPLETED = 'completed'

DELETED = 'deleted'
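
To show how an enumeration of this shape is typically used, the sketch below defines a local stand-in that mirrors the four documented members, so it runs without the package installed. The helper `is_active` is hypothetical and not part of the library.

```python
from enum import Enum


class JobState(Enum):
    """Local stand-in mirroring the documented members of apcloudy.models.JobState."""
    SCHEDULED = "scheduled"
    RUNNING = "running"
    COMPLETED = "completed"
    DELETED = "deleted"


def is_active(state: JobState) -> bool:
    """Hypothetical helper: True while a job is still queued or executing."""
    return state in (JobState.SCHEDULED, JobState.RUNNING)


# Look a member up from the raw string an API payload might carry.
state = JobState("running")
```

Looking up a value that is not one of the four members raises ValueError, which is standard Enum behaviour.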

Job

class apcloudy.models.Job(job_id: str, spider_name: str, state: JobState, project_id: str = '', created_at: datetime | None = None, started_at: datetime | None = None, finished_at: datetime | None = None, items_scraped: int = 0, requests_made: int = 0, job_args: Dict[str, Any] = <factory>, units: int = 1, logs_url: str | None = None, items_url: str | None = None)

Represents a job execution and maintains information related to job lifecycle, metrics, and associated resources.

This class is used for tracking the progress, state, and details of a specific job. It can manage metadata such as creation time, start time, finish time, and other attributes that describe the job’s execution process.

Variables:

job_id – Unique identifier for the job assigned by the system.
spider_name – Name of the spider used to execute the job.
state – Current state of the job, represented as a JobState instance.
project_id – Identifier of the project the job belongs to.
created_at – Timestamp when the job was created, or None if not available.
started_at – Timestamp when the job was started, or None if not available.
finished_at – Timestamp when the job was finished, or None if not available.
items_scraped – Total number of items successfully scraped by the job.
requests_made – Total number of requests made during the job execution.
job_args – Dictionary of additional arguments or configuration parameters passed to the job.
units – Number of resource units used (e.g., processing capacity) by the job.
logs_url – URL containing logs associated with the job, or None if not set.
items_url – URL containing scraped items for the job, or None if not set.

job_id: str

spider_name: str

state: JobState

project_id: str = ''

created_at: datetime | None = None

started_at: datetime | None = None

finished_at: datetime | None = None

items_scraped: int = 0

requests_made: int = 0

job_args: Dict[str, Any]

units: int = 1

logs_url: str | None = None

items_url: str | None = None

classmethod from_dict(data: List[Dict[str, Any]]) → List[Job]

Creates a Job instance from a dictionary representation and displays the job details in a tabulated format.

This method is primarily responsible for deserializing structured data into a Job instance and setting attributes accordingly. Additionally, it formats and prints job details like job ID, spider name, state, and timestamps in an organized layout.

Parameters: data (Dict[str, Any]) – Dictionary containing the job data.
Returns: A Job instance populated from the given data.
Return type: Job

property duration: float | None

Get job duration in seconds

__init__(job_id: str, spider_name: str, state: JobState, project_id: str = '', created_at: datetime | None = None, started_at: datetime | None = None, finished_at: datetime | None = None, items_scraped: int = 0, requests_made: int = 0, job_args: Dict[str, Any] = <factory>, units: int = 1, logs_url: str | None = None, items_url: str | None = None) → None
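
Two details of the Job class can be illustrated with a cut-down stand-in: the `<factory>` default shown for job_args, and the duration property. The class `JobSketch` below is hypothetical and carries only the fields needed here. The duration calculation (finish time minus start time) is an assumption about how the property behaves, not a copy of the library's code.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class JobSketch:
    """Cut-down stand-in for Job; only the fields needed here."""
    job_id: str
    started_at: Optional[datetime] = None
    finished_at: Optional[datetime] = None
    # A default_factory is what appears as `<factory>` in a dataclass signature:
    # each instance gets its own fresh dict instead of a shared one.
    job_args: Dict[str, Any] = field(default_factory=dict)

    @property
    def duration(self) -> Optional[float]:
        """Seconds between start and finish, or None if either is missing."""
        if self.started_at is None or self.finished_at is None:
            return None
        return (self.finished_at - self.started_at).total_seconds()
```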

Spider

class apcloudy.models.Spider(name: str, description: str = '', project_id: str = '', settings: Dict[str, Any] = <factory>)

Represents a spider

name: str

description: str = ''

project_id: str = ''

settings: Dict[str, Any]

classmethod from_dict(data: List[Dict[str, Any]]) → List[Spider]

Create Spider instance from API response

__init__(name: str, description: str = '', project_id: str = '', settings: Dict[str, Any] = <factory>) → None

Project

class apcloudy.models.Project(project_id: str, org_name: str, name: str, description: str = '', created_at: datetime | None = None, spider_count: int = 0, job_count: int = 0)

Represents a project

project_id: str

org_name: str

name: str

description: str = ''

created_at: datetime | None = None

spider_count: int = 0

job_count: int = 0

classmethod from_dict(data: Dict[str, Any]) → Project

Create Project instance from API response

__init__(project_id: str, org_name: str, name: str, description: str = '', created_at: datetime | None = None, spider_count: int = 0, job_count: int = 0) → None

Exceptions

APCloudy-specific exceptions

exception apcloudy.exceptions.APCloudyException

Base exception for APCloudy operations

exception apcloudy.exceptions.APIError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when API returns an error response

__init__(message: str, status_code: int | None = None, response_data: dict | None = None)

exception apcloudy.exceptions.AuthenticationError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when authentication fails

exception apcloudy.exceptions.JobNotFoundError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when a job is not found

exception apcloudy.exceptions.ProjectNotFoundError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when a project is not found

exception apcloudy.exceptions.SpiderNotFoundError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when a spider is not found

exception apcloudy.exceptions.RateLimitError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when rate limit is exceeded
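
One way to handle these exceptions is to test for the most specific type first and fall back to the broader ones. The sketch below uses local stand-in classes so it runs without the package. It assumes that the specific errors subclass APIError (their shared constructor suggests this, but the reference does not state it) and that the constructor arguments are stored as attributes of the same name. The `describe` helper is hypothetical.

```python
from typing import Optional


class APCloudyException(Exception):
    """Stand-in for apcloudy.exceptions.APCloudyException."""


class APIError(APCloudyException):
    """Stand-in carrying the documented constructor arguments."""

    def __init__(self, message: str, status_code: Optional[int] = None,
                 response_data: Optional[dict] = None):
        super().__init__(message)
        self.message = message
        self.status_code = status_code  # assumed attribute name
        self.response_data = response_data  # assumed attribute name


class RateLimitError(APIError):  # assumed parent class
    pass


class ProjectNotFoundError(APIError):  # assumed parent class
    pass


def describe(exc: Exception) -> str:
    """Map an exception to a short action hint, most specific type first."""
    if isinstance(exc, RateLimitError):
        return "back off and retry"
    if isinstance(exc, ProjectNotFoundError):
        return "check the project ID"
    if isinstance(exc, APIError):
        return f"API error (status {exc.status_code})"
    if isinstance(exc, APCloudyException):
        return "APCloudy failure"
    return "unexpected error"
```

The same ordering applies to `except` clauses: listing the base class first would shadow the more specific handlers.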