You will need `yapapi` - Golem's high-level API - and its dependencies in the requestor agent, of course. To follow along with the workshop, check out the repo's `workshop` branch. The `workshop` branch contains a template for the application with some boilerplate filled in for you. If you'd like to take a look at the finished implementation instead, please use the repo's `master` branch.
If you open the `worker.py` file from the `workshop` branch, you'll see that some boilerplate is already in place. Note the three constants at the top: `HASH_PATH`, `WORDS_PATH` and `RESULT_PATH` - those are the paths to the locations within the Docker image that contain the hash to be cracked, the slice of the dictionary we want this node to process and, finally, the path to the result file, in case a result is found within the processed slice of the dictionary.
For convenience, each of those is wrapped in a `Path` object. The `TODO` comments mark the pieces left for you to fill in: reading each word from the list as bytes (the `line_bytes` variable) and computing its hash (the `line_hash` variable). To compute the hashes we'll use the `sha256` function from the `hashlib` library (bundled with Python), so we need to import it by adding a line to our imports at the top of the file.
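For reference, a completed `worker.py` could look roughly like this. It's only a sketch: the constant values, the exact JSON layout of the input files and the result format are assumptions, so the boilerplate in the repo may differ in the details.

```python
import json
from hashlib import sha256
from pathlib import Path

# Locations inside the VM image (assumed values - check the repo's boilerplate)
HASH_PATH = Path("/golem/input/hash.json")
WORDS_PATH = Path("/golem/input/words.json")
RESULT_PATH = Path("/golem/output/result.json")


def main():
    # the hash we're trying to crack and this node's slice of the dictionary
    target_hash = json.loads(HASH_PATH.read_text())
    words = json.loads(WORDS_PATH.read_text())

    for word in words:
        line_bytes = word.strip().encode("utf-8")   # the candidate word as bytes
        line_hash = sha256(line_bytes).hexdigest()  # its SHA-256 digest
        if line_hash == target_hash:
            # found a match within this slice - write it to the result file
            RESULT_PATH.write_text(json.dumps(word))
            break


if __name__ == "__main__":
    main()
```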
To test the script locally, we'll use a much shorter word list (`data/words-short.json`), which is also included in our example, alongside a sample hash derived from one of the words in that shorter list (`data/hash-short.json`). The hash should match the word `test` from that list.

The shortened word list (`data/words-short.json`) is a JSON file, as this is the format which our `worker.py` script expects. It corresponds to a single slice of the original word list.

These test files don't sit at `worker.py`'s input paths, though, so let's temporarily replace the constants at the beginning of the file to point to our shorter lists, as shown below. With that done, we can run the `worker.py` script (it needs to be executed from the project's root directory). The output should be `"test"`, which matches the expected password mentioned above.
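A sketch of the temporary override for the local test - the result path here is an assumption:

```python
# Temporary values for the local test - revert before building the VM image.
HASH_PATH = Path("data/hash-short.json")
WORDS_PATH = Path("data/words-short.json")
RESULT_PATH = Path("data/result.json")

# then, from the project's root directory:  python worker.py
```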
With the `worker.py` script ready, it's time to take a look at the VM image which will be used to run our code on providers, and at the `Dockerfile` it's built from.
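Based on the description below, the `Dockerfile` is roughly along these lines (a sketch; the base image's version tag in particular is an assumption):

```dockerfile
FROM python:3.8-slim
VOLUME /golem/input /golem/output
COPY worker.py /golem/entrypoint/
WORKDIR /golem/entrypoint
```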
We base our image on the official `python` image, since we want it to run our `worker.py` script, and choose the `slim` variant to reduce the image's size.

We define two volumes: `/golem/input` and `/golem/output`. Volumes are directories that can be shared with the host machine and, more importantly, through which the execution environment supervisor (the process on the provider's host machine) will be able to transfer data into and out of the VM. For a Golem VM, the image must define at least one volume.

We copy our `worker.py` script to the path `/golem/entrypoint` within the image. Later on, we'll see how the requestor code uses this path to run our script.
Note that you don't strictly need to include the script (`worker.py` above) in the image itself. Instead, one can push it to individual providers at runtime using the work context's `.send_file()` command.

Finally, we set `/golem/entrypoint` as the working directory of the image. It will be the default location for commands executed by this image. As far as the `gvmkit-build` tool is concerned, the `WORKDIR` doesn't need to be present, in which case the working directory will be set to `/` and the paths to the binaries run will need to be absolute. Also note that the `ENTRYPOINT` statement - if present in your Dockerfile - is effectively ignored and replaced with the exeunit's own entrypoint.
With the `Dockerfile` ready, we need to:

1. Build the Docker image from the `Dockerfile`.
2. Convert the Docker image to a `.gvmi` file using `gvmkit-build`.
3. Push the `.gvmi` file to Golem's image repository.

`gvmkit-build` is included in `requirements.txt`, so it should already be installed in the virtual environment used for this example. Note that pushing the image with the `--push` option needs to be a discrete step.
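The sequence could look like this (the `hash-cracker` image tag is just an example name):

```bash
docker build -t hash-cracker .
gvmkit-build hash-cracker
gvmkit-build hash-cracker --push
```

The `--push` step prints the hash of the uploaded image - note it down, as it's the hash we'll later pass to `vm.repo()` in the requestor agent.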
Now, on to the requestor agent. All of its interactions with the Golem network go through the `yagna` daemon and are handled by our high-level API via the daemon's REST API. The jobs themselves are described with `Task` objects that directly represent the singular jobs that are given to provider nodes. The whole agent lives in the `requestor.py` file. If you open `requestor.py` from the `workshop` branch, you'll see the following pieces of boilerplate:
- An argument parser defining two parameters, `hash` and `words`. These are paths to files containing the hash we're looking for and the dictionary which we hope to find the hash in, respectively. Otherwise, it's just a regular Python argument parser invocation using `argparse`.
- Two empty functions, `data` and `steps` - filling these in will be our main task in this section.
- The `main` function, which we will also need to supplement with a proper call to our API's `Golem` class to bind the previous two together.
- Finally, the code that runs the `main` routine and does some rudimentary error handling, just in case something goes amiss and we're forced to abort our task with a somewhat rude Ctrl-C.
Let's start with the `data` function. It accepts the `words_file` path and the `chunk_size`, which is the size of each dictionary slice, defined by its line count. The `data` function produces a generator yielding `Task` objects that describe each task fragment.

While reading the words file, it keeps a list (`chunk`) which it fills with the lines from said file, stripping them of any preceding or trailing whitespace or newline characters (`line.strip()`). Once that list grows to `chunk_size` - or once all lines have been read from the input file - it yields the respective `Task` with its `data` set to the just-constructed list. In other words, each slice of the word list ends up as the input of exactly one `Task`.
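Put together, a minimal version of `data` could look like this (the default `CHUNK_SIZE` value here is just an example):

```python
from pathlib import Path

from yapapi import Task

CHUNK_SIZE = 100_000  # example value: number of dictionary lines per slice


def data(words_file: Path, chunk_size: int = CHUNK_SIZE):
    chunk = []
    with open(words_file) as f:
        for line in f:
            chunk.append(line.strip())
            if len(chunk) == chunk_size:
                # a full slice - wrap it in a Task and start collecting a new one
                yield Task(data=chunk)
                chunk = []
    if chunk:
        # the last, possibly shorter, slice
        yield Task(data=chunk)
```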
Now for the `steps` function in our example. It accepts `context`, which is a `WorkContext` instance, and `tasks` - an iterable of `Task`s which will be filled with task fragments coming from the `data` function that we defined in the previous step.

The `WorkContext` gives us a simple interface to construct a script that translates directly to commands interacting with the execution unit on the provider's end. Each such work context refers to one activity started on one provider node. While constructing such a script, we can define the steps that need to happen once per worker run (in other words, once per provider node) - those are placed outside of the loop iterating over `tasks`.
In our case, that one-time step is the `.send_file()` invocation. It transfers the file containing the hash we'd like to crack and instructs the execution unit to store it under `worker.HASH_PATH`, which is a location within the VM container that we had previously defined in our `worker.py` script. We perform this step just once here because that piece of task input doesn't change.

Then, for each task, the script consists of:

- a `.send_json()` call, which tells the exe-unit to store the given subset of words as a JSON-serialized file in another path within the VM that we had defined in `worker.py` (`worker.WORDS_PATH`; note that in this function the destination comes first, followed by the object to be serialized),
- a `.run()` call, which is the one that actually executes the `worker.py` script inside the provider's VM and in turn produces output (as you remember, this may be empty or may contain our solution),
- a `.download_file()` call, which transfers that solution file back to a temporary file on the requestor's end.

Note that the `.run()` call to the VM execution unit must directly refer to a given executable, which usually means specifying its full, absolute path. There's no shell (and hence, no `PATH`) there to rely upon.
Finally, we call `.commit()` on our work context and yield that to the calling code (the processing inside the `Golem` class), which takes our script and orchestrates its execution on the provider's end.

By the time we regain control after yielding from the `steps` function, the `task` has already been completed. Now, we only need to call `Task.accept_result()` with the result coming from the temporary file transferred from the provider. This ensures that the result is what's yielded from the `Golem` engine to the final loop in our `main` function, which we'll define next.
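Putting the above together, `steps` could look more or less like the sketch below. The interpreter path inside the VM, the way the hash file path (`args.hash`) reaches the function and the raw-text result handling are all assumptions:

```python
from tempfile import NamedTemporaryFile

from yapapi import WorkContext

import worker  # our worker.py, used here only for its path constants


async def steps(context: WorkContext, tasks):
    # one-time step: the hash to crack is the same for every task on this node
    # (`args` comes from the argparse boilerplate)
    context.send_file(args.hash, str(worker.HASH_PATH))

    async for task in tasks:
        # store this task's slice of the dictionary as JSON inside the VM
        context.send_json(str(worker.WORDS_PATH), task.data)
        # run the worker - an absolute path to the executable, as there's no shell
        context.run("/usr/local/bin/python", "worker.py")
        # bring the (possibly empty) result file back to a temporary file
        output = NamedTemporaryFile(suffix=".json")
        context.download_file(str(worker.RESULT_PATH), output.name)
        # hand the whole script over for execution on the provider
        yield context.commit()
        # by now the commands have run - read the result and accept it
        with open(output.name) as f:
            task.accept_result(result=f.read())
        output.close()
```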
Now let's complete the `main` function in the boilerplate. First, we replace the image hash in the `vm.repo()` invocation with the noted-down one - the hash printed when we pushed our image to the repository.

Next comes the `Golem` engine. It is given our GLM `budget` and the `subnet_tag` - a subnet identifier for the collection of nodes that we want to utilize to run our tasks. Unless you know what you're doing, you're better off leaving this at the value defined as the default parameter in our boilerplate code.
`golem` is used with `async with` as an asynchronous context manager. This guarantees that all internal mechanisms the engine needs for computing our tasks are started before the code in the body of `async with` is executed, and are properly shut down afterwards.

With `golem` started, we are ready to call its `execute_tasks` method. Here we instruct `golem` to use the `steps` function for producing commands for each task, and the iterator produced by `data(args.words)` to provide the tasks themselves. We also tell it that the provider nodes need to use the `payload` specified by the `package` we defined above. And finally, there's the `timeout` in which we expect the whole processing on all nodes to have finished.
Using `async for`, we iterate over the tasks computed by `execute_tasks` and check their results. As soon as we encounter a task with `task.result` set to a non-empty string, we can `break` from the loop instead of waiting until the remaining tasks are computed. That `result` should contain our solution, and the solution is printed to your console (unless, of course, it happens that the hash we're trying to break is not found within the dictionary that we have initially assumed it would come from - which, we assure you, is not the case for our example hash ;) ).
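Assembled, the `main` function could look roughly like this. The budget, timeout, resource requirements and the way the parsed arguments and subnet tag are passed in are assumptions - use the values from the boilerplate and your own noted-down image hash:

```python
from datetime import timedelta

from yapapi import Golem
from yapapi.payload import vm


async def main(args):
    # the image published with `gvmkit-build --push`; replace with your own hash
    package = await vm.repo(
        image_hash="insert-your-noted-down-image-hash-here",
        min_mem_gib=0.5,
        min_storage_gib=2.0,
    )

    async with Golem(budget=1.0, subnet_tag=args.subnet_tag) as golem:
        async for task in golem.execute_tasks(
            steps,                         # our worker-script generator
            data(args.words),              # the iterator of Task objects
            payload=package,               # the VM image providers should use
            timeout=timedelta(minutes=30),
        ):
            if task.result:
                # a non-empty result means one of the providers found the word
                print(f"Found the password: {task.result}")
                break
```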
To run the app on Golem, we of course also need a running `yagna` daemon itself. If you haven't set up and initialized your `yagna` daemon yet, please do that first, following the daemon setup instructions.

Before launching the requestor agent, make sure that:

- you have replaced the image hash in `requestor.py` (in the `image_hash` parameter to `vm.repo()`),
- your `yagna` daemon is running and is properly funded and initialized as a requestor,
- you have the dependencies of the `hash-cracker` app installed.

Then go to the directory containing the `requestor.py` script within the checked-out repo and run it:
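For example (the file names of the full hash and word list are assumptions - pass the actual paths from the repo's `data` directory):

```bash
python requestor.py data/hash.json data/words.json
```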