AttributeError: ‘NoneType’ object has no attribute ‘is_finalizing’ when using Dask and distributed.Future

I am currently working on a project that involves using Dask for parallel and distributed computing. I’ve been encountering an issue related to the distributed library, specifically when working with distributed.Future objects. Here’s the error message I’m getting:

AttributeError: 'NoneType' object has no attribute 'is_finalizing'

This error occurs when I try to access the is_finalizing attribute on a distributed.Future object. It seems to suggest that the object is None, but I’m not sure why this is happening.

I’ve double-checked my code and the way I’m creating and using Future objects, and it seems correct to me. However, I’m still facing this error.

Here’s a simplified example of how I’m creating a Future:

import pandas as pd
from dask.distributed import Client, LocalCluster

# Extraction function (raise_chrome() is working fine)
def extract_chunk(df):
    browser = raise_chrome(headless=True)
    list_dicts = []

    for i in df.index: 
        list_product_dicts = secure_scrape(browser, df.loc[i])

        if list_product_dicts:   
            list_dicts.extend(list_product_dicts)

    browser.quit()
    return list_dicts

# Chunk our dataframe, where each chunk is a range of our base df
chunked_dfs = [df_base.loc[chunk].copy(deep=True) for chunk in chunks]

# Create a LocalCluster with x workers and 1 thread per worker
with LocalCluster(n_workers=x, threads_per_worker=1) as cluster:
    # Create a Dask Client connected to the cluster
    with Client(cluster) as client:
        print(client)

        futures = [client.submit(extract_chunk, df) for df in chunked_dfs]
        dask_dicts = client.gather(futures)

# Data transformation
flatten_list = [item for sublist in dask_dicts for item in sublist]
df_dask = pd.DataFrame(flatten_list)

I’ve also made sure that I’m using the latest versions of Dask and the distributed library.

I’m hoping to get some guidance on what might be causing this error and how to fix it. Is there something I’m missing or a common mistake I should be aware of when working with distributed.Future objects in Dask?

Environment:

  • Python version: 3.10
  • Dask version: 2023.9.1
  • Distributed version: 2023.9.1
  • Operating system: Ubuntu 22.04.2 LTS running under WSL2

  • Please post the whole traceback. The last line is not enough to determine what the problem is. Also, would it be possible to simplify your example even further?
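
    One way to capture the complete traceback rather than only its last line is to wrap the failing call and print the output of traceback.format_exc(). The following is only a minimal sketch: failing_call is a hypothetical stand-in for whichever line in your script raises (for example the client.gather(futures) call), and it assumes the exception actually propagates from that call. If the message is printed during interpreter shutdown instead, a try/except like this will not catch it, and the full console output would be needed.

```python
import traceback


def failing_call():
    # Hypothetical stand-in for the line that raises in the real script,
    # e.g. client.gather(futures). Here we simply reproduce the same kind
    # of AttributeError by calling a method on None.
    obj = None
    return obj.is_finalizing()


def capture_traceback(func):
    """Run func; return the full formatted traceback, or None if it succeeded."""
    try:
        func()
    except Exception:
        # format_exc() returns every frame, not just the final error line.
        return traceback.format_exc()
    return None


if __name__ == "__main__":
    print(capture_traceback(failing_call))
```

    Pasting that whole string into the question shows which frame, yours or one inside the library, touched the None object.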

