Vision

This module provides functions for getting visual information into Python so that our agent can process it.

win32 screen capture



win32_view

 win32_view (location:Union[algorithmic_gamer.utility.types.Window,algorithmic_gamer.utility.types.Region], time:bool=False)

Capture a screenshot of the specified window or region on the desktop.

Parameters: location (Window or Region): The window or region to capture. The signature also accepts time (bool, default False), which this docstring does not describe.

Returns: np.ndarray: The screenshot as a 3-channel image in RGB format.
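
The RGB guarantee matters because win32 bitmap buffers are commonly laid out as 32-bit BGRA. The sketch below shows one way such a buffer could be turned into a 3-channel RGB array. It assumes a BGRA input and is not taken from the win32_view source; bgra_bytes_to_rgb is a hypothetical helper name.

```python
import numpy as np

def bgra_bytes_to_rgb(raw: bytes, width: int, height: int) -> np.ndarray:
    """Convert a raw 32-bit BGRA buffer into an (h, w, 3) RGB uint8 array."""
    # Interpret the flat buffer as rows of 4-byte pixels (assumed BGRA order).
    bgra = np.frombuffer(raw, dtype=np.uint8).reshape(height, width, 4)
    # Drop alpha and reverse the channel order: B, G, R -> R, G, B.
    return bgra[..., 2::-1].copy()

# Two pixels stored as B, G, R, A: pure blue, then pure red.
raw = bytes([255, 0, 0, 255, 0, 0, 255, 255])
rgb = bgra_bytes_to_rgb(raw, width=2, height=1)
print(rgb.shape)           # (1, 2, 3)
print(rgb[0, 0].tolist())  # [0, 0, 255]
```

The copy() call detaches the result from the read-only input buffer, so later code can modify the image in place.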



f_get_region

 f_get_region (location:Union[algorithmic_gamer.utility.types.Window,algorithmic_gamer.utility.types.Region])

Get the relevant handle and the region to be captured.

Parameters: location (Window or Region): The window or region to capture.

Returns: Tuple[int, Region, int, int]: A tuple containing the handle to the window or desktop, the region to be captured, and the width and height of the region.
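
The width and height in the returned tuple follow directly from the region edges. A minimal sketch of that arithmetic is shown below. The Region dataclass here is a hypothetical stand-in with the same field names as the printed Region, and region_size is an illustrative helper, not part of the library.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Region:
    # Hypothetical stand-in for algorithmic_gamer.utility.types.Region.
    left: int
    top: int
    right: int
    bottom: int

def region_size(region: Region) -> Tuple[int, int]:
    """Return (width, height) of a region given its four edges."""
    width = int(region.right - region.left)
    height = int(region.bottom - region.top)
    if width <= 0 or height <= 0:
        raise ValueError(f"empty or inverted region: {region}")
    return width, height

reg = Region(left=480, top=270, right=1440, bottom=810)
print(region_size(reg))  # (960, 540)
```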

win32_view Region

w = int(1920/2)
h = int(1080/2)
reg = Region(left=w-(w/2), top=h-(h/2), right=w+(w/2), bottom=h+(h/2))
img = win32_view(location=reg)
print(f'screencapture of desktop at {reg}')
plt.imshow(img)
screencapture of desktop at Region(left=480.0, top=270.0, right=1440.0, bottom=810.0)
<matplotlib.image.AxesImage>

win32_view Window

img = win32_view(location=Window(app_name=fuzzy_app('shannon')))
plt.imshow(img)
<matplotlib.image.AxesImage>

dxcam screen capture

dxcam is a faster alternative to win32_view, but it requires creating a camera instance to collect screenshots. It also currently supports only Region.



dxcam_view

 dxcam_view (cam, location:Union[algorithmic_gamer.utility.types.Window,algorithmic_gamer.utility.types.Region], time:bool=False)

This function captures an image from a specified camera view.

Parameters:
- cam: The camera object to capture an image from.
- location (Union[Window, Region]): The location to capture the image from. Can be a Window or Region object.
- time (bool): Whether to return the current UTC time along with the image. Default is False.

Returns:
- np.ndarray: The captured image.
- datetime: The current UTC time (if time=True)
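
The effect of the time flag on the return value can be sketched without a real camera. The example below is an illustration only: FakeCamera and view are hypothetical names, and the region is passed as a (left, top, right, bottom) tuple, which is an assumption about how the camera is called.

```python
from datetime import datetime, timezone
import numpy as np

class FakeCamera:
    """Hypothetical stand-in for a camera object; returns a black frame."""
    def grab(self, region=None):
        left, top, right, bottom = region
        return np.zeros((bottom - top, right - left, 3), dtype=np.uint8)

def view(cam, region, time=False):
    """Grab a frame and, if requested, pair it with the current UTC time."""
    img = cam.grab(region=region)
    if time:
        return img, datetime.now(timezone.utc)
    return img

img, stamp = view(FakeCamera(), (480, 270, 1440, 810), time=True)
print(img.shape)     # (540, 960, 3)
print(stamp.tzinfo)  # UTC
```

Because the return type changes with the flag, callers need to unpack two values only when time=True.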



create_dxcam

 create_dxcam ()

This function creates and returns a dxcam object. If dxcam is not found, an exception is raised.

Returns:
- dxcam object: The created dxcam object.
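
One way to get the "raise if dxcam is not found" behavior is to import the backend lazily and convert the import failure into a clearer error. The sketch below assumes that approach. create_camera and the RuntimeError type are illustrative choices, not the library's documented behavior.

```python
import importlib

def create_camera(module_name: str = "dxcam"):
    """Import the capture backend and create a camera, or raise a clear error."""
    try:
        backend = importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{module_name} is not installed; it is needed for fast screen capture"
        ) from exc
    # dxcam exposes a create() factory that returns a camera object.
    return backend.create()

try:
    create_camera("backend_that_does_not_exist")  # hypothetical module name
except RuntimeError as err:
    print(err)
```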

w = int(1920/2)
h = int(1080/2)
reg = Region(left=w-(w/2), top=h-(h/2), right=w+(w/2), bottom=h+(h/2))

cam = create_dxcam()
img = dxcam_view(cam, location=reg)
print(f'screencapture of desktop at {reg}')
plt.imshow(img)
screencapture of desktop at Region(left=480.0, top=270.0, right=1440.0, bottom=810.0)
<matplotlib.image.AxesImage>

Timing output (the timed calls are not shown in this section):

5.59 ms ± 98.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
52.1 µs ± 326 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
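
The figures above have the format produced by IPython's %timeit. Outside a notebook, a comparable measurement can be sketched with the standard timeit module. In the example below, time_call and fake_capture are hypothetical names, and the workload is a placeholder rather than a real capture call.

```python
import statistics
import timeit

def time_call(fn, repeat=7, number=100):
    """Return (mean, stdev) seconds per call over `repeat` runs of `number` calls."""
    totals = timeit.repeat(fn, repeat=repeat, number=number)
    per_call = [t / number for t in totals]
    return statistics.mean(per_call), statistics.stdev(per_call)

def fake_capture():
    # Hypothetical placeholder workload; swap in a real capture call.
    return sum(range(1000))

mean, std = time_call(fake_capture)
print(f"{mean * 1e6:.1f} µs ± {std * 1e6:.1f} µs per loop")
```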

normalization



f_naive_normalize_image

 f_naive_normalize_image (tensor)

Normalize a 3D tensor with values in the range [0, 255] to the range [-1, 1]

Args: tensor: A 3D tensor with shape (h, w, 3) and values in the range [0, 255]

Returns: A normalized version of the input tensor with values in the range [-1, 1]

f_naive_normalize_image(img)
array([[[-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        ...,
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824]],

       [[-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        ...,
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824]],

       [[-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        ...,
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824]],

       ...,

       [[-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        ...,
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824]],

       [[-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        ...,
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824]],

       [[-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        ...,
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824],
        [-0.69411765, -0.69411765, -0.67058824]]])
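
The values above are consistent with a linear map x / 127.5 - 1: a pixel of (39, 39, 42) lands at roughly (-0.694, -0.694, -0.671). A minimal sketch of that mapping follows. It assumes this formula and is not copied from the f_naive_normalize_image source; naive_normalize is a hypothetical name.

```python
import numpy as np

def naive_normalize(img: np.ndarray) -> np.ndarray:
    """Map values from [0, 255] to [-1, 1] linearly."""
    # Cast first so uint8 inputs do not overflow or truncate.
    return img.astype(np.float64) / 127.5 - 1.0

pixel = np.array([[[39, 39, 42]]], dtype=np.uint8)
print(naive_normalize(pixel))
```

With this mapping, 0 becomes -1, 255 becomes 1, and 127.5 would sit at 0.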

resizing



f_resize_image

 f_resize_image (img:numpy.ndarray, w:int, h:int, interpolation:int=3)

Resize the given image to the specified width and height, cropping to maintain the correct aspect ratio if necessary.

Parameters:
- img (np.ndarray): The image to resize, represented as a NumPy array.
- w (int): The desired width of the resized image.
- h (int): The desired height of the resized image.
- interpolation (int): The interpolation method to use for resizing. Can be one of 0 (nearest neighbor), 1 (Lanczos), 2 (bilinear), 3 (bicubic), 4 (box), or 5 (Hamming). The default is 3 (bicubic).

Returns:
- np.ndarray: The resized image, represented as a NumPy array.

Example:
>>> img = np.array([[[255, 0, 0], [0, 255, 0]], [[0, 0, 255], [255, 255, 255]]])
>>> f_resize_image(img, w=2, h=1).shape
(1, 2, 3)
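
The crop-then-resize behavior described above can be sketched in plain NumPy. This is an illustration under stated assumptions: the crop is centered, and the resize uses nearest-neighbor sampling only (comparable to interpolation 0), whereas the real function supports several filters. crop_to_aspect and resize_nearest are hypothetical names.

```python
import numpy as np

def crop_to_aspect(img: np.ndarray, w: int, h: int) -> np.ndarray:
    """Center-crop img so that its width/height ratio matches w/h."""
    src_h, src_w = img.shape[:2]
    if src_w * h > src_h * w:
        # Source is too wide for the target ratio: trim columns.
        new_w = max(1, round(src_h * w / h))
        x0 = (src_w - new_w) // 2
        return img[:, x0:x0 + new_w]
    # Source is too tall (or already matches): trim rows.
    new_h = max(1, round(src_w * h / w))
    y0 = (src_h - new_h) // 2
    return img[y0:y0 + new_h, :]

def resize_nearest(img: np.ndarray, w: int, h: int) -> np.ndarray:
    """Crop to the target aspect ratio, then resize by nearest-neighbor sampling."""
    cropped = crop_to_aspect(img, w, h)
    src_h, src_w = cropped.shape[:2]
    rows = np.arange(h) * src_h // h  # source row for each output row
    cols = np.arange(w) * src_w // w  # source column for each output column
    return cropped[rows][:, cols]

img = np.zeros((600, 700, 3), dtype=np.uint8)  # a 700x600 capture
print(resize_nearest(img, 300, 300).shape)  # (300, 300, 3)
```

Cropping before resizing avoids stretching the image: here the 700x600 capture is first trimmed to 600x600 and only then scaled down to 300x300.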

reg = Region(left=900, top=400, right=1600, bottom=1000)
img = f_resize_image(win32_view(location=reg), 300, 300)
img.shape, plt.imshow(img)
((300, 300, 3), <matplotlib.image.AxesImage>)