Frequently Asked Questions

You might find an answer to your question here

General Questions

Machine Related Questions

Job Related Questions

Error Messages


General Questions


What is Condor

Condor is a system that allows us to harness the spare computing capacity of a number of desktop computers across campus for useful work when they would otherwise be idle. Users submit their jobs to the Condor system which then places those jobs into a queue, chooses where and when to run them, carefully monitors their progress, and informs the user upon completion.

A Condor job is a command line Windows executable that reads data from one or more input files and writes results to one or more output files. When the user submits their job to Condor they need to tell Condor what to do and how to do it by writing a submit script.

What can Condor do

Condor can run a broad range of applications. To give you an idea of the range of applications Condor can run we have drawn up a list of the popular classes of problems that researchers run on the Condor pool. Please note this list is in no way exhaustive.

Parameter Sweep : A Parameter Sweep problem is one in which one or more parameters sweep from a particular start value to an end value in regular increments, for example: I have three parameters that each take P values, so I need to run P*P*P different jobs. Using Condor I can run N jobs at the same time instead of just one.

Monte Carlo : A Monte Carlo problem is one that relies on repeated random sampling to compute a result. Monte Carlo methods are often used when simulating physical systems such as liquids, disordered materials, strongly coupled solids, and cellular structures, for example: I need to run the same job N times. Using Condor I can run N copies at the same time instead of just one.

Embarrassingly Parallel : An Embarrassingly Parallel program is one for which no particular effort is needed to segment the problem into a very large number of parallel tasks, and there is no communication between those parallel tasks, for example: I can divide my data into N chunks. Using Condor I can run N chunks at the same time instead of just one.

Pattern Matching : Pattern matching is the process of checking for the presence of the constituents of a particular pattern, for example: Find a string of characters, or Find an image in a collection. A number of researchers are using Condor for image processing using in-house codes and systems such as Matlab, Scilab, and ImageMagick. Using Condor a researcher can run N matches at the same time instead of just one.
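As a rough illustration of the Parameter Sweep case, the sketch below lists one job per combination of parameter values. The parameter names and ranges are hypothetical, and Python is used here only for illustration.

```python
import itertools


def sweep(start, stop, step):
    """Return the values from start to stop (inclusive) in regular increments."""
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]


# Hypothetical parameters: the names and ranges are illustrative only.
temperatures = sweep(10, 30, 10)
pressures = sweep(1, 2, 1)

# One job per combination of parameter values.
jobs = list(itertools.product(temperatures, pressures))

for job_id, (temperature, pressure) in enumerate(jobs):
    print("job %d: temperature=%s pressure=%s" % (job_id, temperature, pressure))
```

Each tuple in jobs could then become the arguments of one Condor job, so that many combinations run at the same time.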

Who uses Condor

The total number of Condor users currently stands at 37, spread across 13 schools. The majority of those users are research associates or students; however, a number of faculty members also use Condor.

Can I use Condor

If you are a researcher with a Cardiff University user account, and your problem can run on the Condor system, then yes you can. There is currently no charge for the Condor service; however, we request that you write a paragraph or two about your research and send it to us so that we may keep a record of the kinds of problems our researchers are running on the Condor pool.

How does Condor work

Condor works on a principle called matchmaking. All machines in the Condor pool, Execute and Submit nodes alike, send regular updates to the Central Manager node. Those updates, which are called ClassAds, contain capability and status information about the node. When a user submits a job from their Submit node a ClassAd containing job requirements is sent to the Central Manager.  The Central Manager responds with a list of machines that match those requirements and the Submit machine then contacts the appropriate Execute machine to run the job.
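The matchmaking idea can be sketched as a toy model. This is not how Condor is implemented: real ClassAds are written in Condor's own expression language, and the machine names and attribute values below are made up for illustration.

```python
# Machine ClassAds modelled as plain dictionaries (illustrative values only).
machine_ads = [
    {"Name": "machine-a", "Memory": 512, "State": "Unclaimed"},
    {"Name": "machine-b", "Memory": 2048, "State": "Unclaimed"},
    {"Name": "machine-c", "Memory": 2048, "State": "Claimed"},
]


def job_requirements(ad):
    """The job's requirements, expressed as a test applied to one machine ad."""
    return ad["Memory"] >= 1024 and ad["State"] == "Unclaimed"


def match(requirements, ads):
    """Return the names of the machines whose ads satisfy the requirements."""
    return [ad["Name"] for ad in ads if requirements(ad)]


matches = match(job_requirements, machine_ads)
print(matches)
```

In this toy model the match function plays the role of the Central Manager: it compares the job's requirements with every machine ad and returns the machines that qualify.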

How much does Condor cost

There is currently no charge for the Condor service.

Where is the Condor manual

You can find the manual here

Where are the Condor training materials

You can find the training materials here


Machine Related Questions


How many machines are in the Condor pool

As of this writing there are about 1000 machines in the Condor pool.

How much memory is installed in machines in the Condor pool

As of this writing about 19% of the machines have less than 512 Mbytes of memory, 28% of the machines have between 512 Mbytes and 1024 Mbytes of memory, 50% of the machines have between 1024 Mbytes and 2048 Mbytes of memory, and 3% of the machines have more than 2048Mbytes of memory installed.

What is the speed of the network connection to the majority of execute nodes

Over 75% of the machines on campus are connected at 100MBit and have been migrated to a layer 3 network. The remaining machines, just under 25%, are connected at 10MBit and are due to be migrated to a layer 3 network over the coming months.

What is the speed of the network connection to the majority of submit nodes

100MBit

What is the speed of the network connection to the central managers

1000MBit

How do I find out the IP address of my machine

Click on Start > Programs > Accessories > Command Prompt and type the following at the prompt: ipconfig. The IP address of the central manager is 131.251.4.206. The IP address of a machine on the layer 2 network will look something like 131.251.*.*. The IP address of a machine on a layer 3 subnet will look something like 10.*.*.*. The stars represent numbers in the range 0 to 254.
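If you want to check many addresses at once, the two address patterns above can be tested with a short script. This is only a sketch: it assumes that 131.251.*.* corresponds to the network 131.251.0.0/16 and that 10.*.*.* corresponds to 10.0.0.0/8.

```python
import ipaddress

# Assumed network ranges for the two address patterns described above.
LAYER2 = ipaddress.ip_network("131.251.0.0/16")  # 131.251.*.*
LAYER3 = ipaddress.ip_network("10.0.0.0/8")      # 10.*.*.*


def classify(address):
    """Say which of the two address patterns an IP address matches."""
    ip = ipaddress.ip_address(address)
    if ip in LAYER2:
        return "layer 2"
    if ip in LAYER3:
        return "layer 3"
    return "unknown"


print(classify("131.251.4.206"))
```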

How do I find out the MAC address of my machine

Click on Start > Programs > Accessories > Command Prompt and type the following command at the prompt: ipconfig -all. The MAC address of the central manager is 00-0D-56-FE-80-B1. The MAC address is the same as the Physical Address.

How do I find out the HOSTNAME of my machine

Click on Start > Programs > Accessories > Command Prompt and type the following command at the prompt: hostname. The HOSTNAME of the central manager is condorman. The HOSTNAME of a machine on the layer 2 or layer 3 network will look something like X000D56FE80B1. This will give you the HOSTNAME used in eDirectory to identify a particular workstation. 

Click on Start > Programs > Accessories > Command Prompt and type the following command at the prompt: nslookup 131.251.4.206. The previous command shows you how to find out the HOSTNAME of the machine with IP address 131.251.4.206 as stored in Socket to identify a particular workstation. 

The HOSTNAME of a machine on the layer 2 or layer 3 network that can submit Condor jobs will look something like *.condor.cf.ac.uk or *.school.condor.cf.ac.uk. The stars represent a sequence of alphanumeric characters.
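In the examples above, the eDirectory HOSTNAME X000D56FE80B1 looks like the letter X followed by the MAC address 00-0D-56-FE-80-B1 with the dashes removed. Assuming that this naming scheme holds for other workstations, which is an inference from the examples and not a documented rule, the conversion could be sketched as follows.

```python
def hostname_from_mac(mac):
    """Build a hostname of the form X<MAC digits> (assumed naming scheme)."""
    digits = mac.replace("-", "").replace(":", "").upper()
    if len(digits) != 12 or any(c not in "0123456789ABCDEF" for c in digits):
        raise ValueError("not a MAC address: %r" % mac)
    return "X" + digits


print(hostname_from_mac("00-0D-56-FE-80-B1"))
```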


Job Related Questions


Can I run commercial applications on the Condor pool

It depends on the license.

First we must make sure that the legal requirements of the license are met.

Second we must make sure that the technical requirements of the licensing process are met. If the application requires access to a hardware dongle to run then it will not be possible to run the application on the Condor pool (unless you have a large number of dongles). If the application requires access to a license server to run then it should be possible to run the application on the Condor pool (if you have a large number of licenses). If the application does not require access to a hardware dongle or a license server then it should be possible to run the application.

Can I run open source applications on the Condor pool

Yes.

Can I run applications compiled from source code on the Condor pool

Yes.

Can I submit a job that runs a number of commands

Yes.

You do this by writing a batch file that contains a number of commands and telling Condor to run the batch file instead of a regular executable file. Batch files can contain commands to mount network shares, download input data files from the share, run one or more executables, upload output files to the share, and unmount network shares.
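The sketch below shows roughly what a submit script that runs a batch file might contain, generated from Python for convenience. The file names (run.bat, program1.exe, input.dat, myjob) are hypothetical, and the settings shown are a minimal assumption; the submit scripts used on this pool may need further lines, so check the manual and training materials before relying on it.

```python
def submit_description(batch_file, input_files, job_name="myjob"):
    """Return the text of a minimal submit script that runs a batch file."""
    lines = [
        "universe = vanilla",
        "executable = %s" % batch_file,
        "transfer_input_files = %s" % ", ".join(input_files),
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "output = %s.out" % job_name,
        "error = %s.err" % job_name,
        "log = %s.log" % job_name,
        "queue",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file names, for illustration only.
text = submit_description("run.bat", ["program1.exe", "input.dat"])
print(text)
```

The text could be saved to a file and passed to condor_submit from a submit node.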

Can I submit a job that mounts a network share

Yes.

You do this by writing a batch file similar to this one. Lines beginning with rem are remarks, not batch file commands.

rem mount the network share (no trailing backslash on the share name)
net use \\computername\sharename
rem download input files
copy \\computername\sharename\input*.dat .
rem files matching input*.dat, e.g. input0.dat and input1.dat, are now downloaded
rem run one or more executables
program1.exe
program2.exe
rem upload output file
copy output.dat \\computername\sharename\
rem output.dat is now uploaded
rem unmount the network share
net use \\computername\sharename /delete

For further information see http://www.computerhope.com/nethlp.htm

Can I target my jobs to run on a particular machine

Yes.

If you want to target your jobs to run on a particular machine you will have to add the following to the requirements line in your submit script: requirements = MACHINE == "X001122334455.CF.AC.UK"

You should replace X001122334455 in the previous example with the name of the machine you want to target.

Can I target my jobs to run on machines donated by a particular school

Yes.

If you are logged into a submit node you can find out which machines in the Condor pool belong to ASCHOOL using the following command at the prompt: condor_status -constraint IS_OWNED_BY==\"ASCHOOL\"

If you want to target your jobs to run on machines donated by ASCHOOL you will have to add the following to the requirements line in your submit script: requirements = IS_OWNED_BY == "ASCHOOL"

You should replace ASCHOOL in the previous examples with the name of the particular school you want to target.

As of this writing the following schools have donated some machines: BIOSI, CARBS, CLAWS, ENGIN, MATHS, OPTOM, and SOCSI.
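If a job needs to meet more than one condition, for example a particular school and a particular software package, the conditions can usually be joined with && on a single requirements line. The helper below is a sketch that builds such a line; the function name and the way the clauses are bracketed are assumptions made for illustration.

```python
def requirements(school=None, features=()):
    """Build a requirements line from a school name and a list of HAS_* attributes."""
    clauses = []
    if school is not None:
        clauses.append('(IS_OWNED_BY == "%s")' % school)
    for feature in features:
        clauses.append("(%s == TRUE)" % feature)
    if not clauses:
        raise ValueError("at least one condition is needed")
    return "requirements = " + " && ".join(clauses)


line = requirements(school="MATHS", features=["HAS_PYTHON_2_6_1_1"])
print(line)
```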

Can I run Blast jobs on the Condor pool

No.

Can I run Matlab jobs on the Condor pool

Yes. Versions 74, 78, and 710 of the Matlab Compiler Runtime are installed on a number of machines in the Condor pool.

Version 74 : The path to version 74 of the runtime on those machines is C:\Condor\opt\matlab\v74\runtime\win32. If you are logged into a submit node you can find out which machines in the Condor pool have version 74 of the runtime installed using the following command at the prompt: condor_status -constraint "HAS_MATLAB_V74". If you want to submit a Matlab job you will have to compile your Matlab code into an executable and add the following to the requirements line in your submit script: requirements = HAS_MATLAB_V74 == TRUE.  You should also run your Matlab executable from a batch file that first sets the path variable correctly before running your executable e.g.

path=%path%;C:\Condor\opt\matlab\v74\runtime\win32
myexecutable.exe

Version 78 : The path to version 78 of the runtime on those machines is C:\Condor\opt\matlab\v78\runtime\win32. If you are logged into a submit node you can find out which machines in the Condor pool have version 78 of the runtime installed using the following command at the prompt: condor_status -constraint "HAS_MATLAB_V78". If you want to submit a Matlab job you will have to compile your Matlab code into an executable and add the following to the requirements line in your submit script: requirements = HAS_MATLAB_V78 == TRUE.  You should also run your Matlab executable from a batch file that first sets the path variable correctly before running your executable e.g.

path=%path%;C:\Condor\opt\matlab\v78\runtime\win32
myexecutable.exe

Version 710 : The path to version 710 of the runtime on those machines is C:\Condor\opt\matlab\v710\runtime\win32. If you are logged into a submit node you can find out which machines in the Condor pool have version 710 of the runtime installed using the following command at the prompt: condor_status -constraint "HAS_MATLAB_V710". If you want to submit a Matlab job you will have to compile your Matlab code into an executable and add the following to the requirements line in your submit script: requirements = HAS_MATLAB_V710 == TRUE.  You should also run your Matlab executable from a batch file that first sets the path variable correctly before running your executable e.g.

path=%path%;C:\Condor\opt\matlab\v710\runtime\win32
myexecutable.exe
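The three Matlab cases differ only in the version tag, so the wrapper batch file and the requirements line can be produced from one template. The sketch below does this; the function names are hypothetical and the paths are copied from the text above.

```python
RUNTIME_ROOT = r"C:\Condor\opt\matlab"
VERSIONS = ("v74", "v78", "v710")


def matlab_wrapper(version, executable):
    """Return the text of a batch file that sets the path and runs the executable."""
    if version not in VERSIONS:
        raise ValueError("unknown runtime version: %r" % version)
    runtime = r"%s\%s\runtime\win32" % (RUNTIME_ROOT, version)
    return "path=%%path%%;%s\n%s\n" % (runtime, executable)


def matlab_requirement(version):
    """Return the matching requirements line for the submit script."""
    if version not in VERSIONS:
        raise ValueError("unknown runtime version: %r" % version)
    return "requirements = HAS_MATLAB_%s == TRUE" % version.upper()


print(matlab_wrapper("v78", "myexecutable.exe"))
print(matlab_requirement("v78"))
```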

Can I run Perl jobs on the Condor pool

Yes. Version 5.10.0.1004 of Perl is installed on a number of machines in the Condor pool.

The path to perl.exe on those machines is C:\Condor\opt\perl\5.10.0.1004\bin\perl.exe. If you are logged into a submit node you can find out which machines in the Condor pool have Perl installed using the following command at the prompt: condor_status -constraint "HAS_PERL_5_10_0_1004". If you want to submit a Perl job you will have to add the following to the requirements line in your submit script: requirements = HAS_PERL_5_10_0_1004 == TRUE.  You should also run your Perl script from a batch file that first sets the path variable correctly before running your executable e.g.

path=%path%;C:\Condor\opt\perl\5.10.0.1004\bin
perl.exe myscript.pl

Can I run Python jobs on the Condor pool

Yes. Version 2.6.1.1 of Python is installed on a number of machines in the Condor pool.

The path to python.exe on those machines is C:\Condor\opt\python\2.6.1.1\python.exe. If you are logged into a submit node you can find out which machines in the Condor pool have Python installed using the following command at the prompt: condor_status -constraint "HAS_PYTHON_2_6_1_1". If you want to submit a Python job you will have to add the following to the requirements line in your submit script: requirements = HAS_PYTHON_2_6_1_1 == TRUE.  You should also run your Python script from a batch file that first sets the path variable correctly before running your executable e.g.

path=%path%;C:\Condor\opt\python\2.6.1.1
python.exe myscript.py

Can I run R jobs on the Condor pool

Yes. Version 2.8.1 of R is installed on a number of machines in the Condor pool.

The path to r.exe on those machines is C:\Condor\opt\r\2.8.1\bin\r.exe. If you are logged into a submit node you can find out which machines in the Condor pool have R installed using the following command at the prompt: condor_status -constraint "HAS_R_2_8_1". If you want to submit an R job you will have to add the following to the requirements line in your submit script: requirements = HAS_R_2_8_1 == TRUE.  You should also run your R script from a batch file that first sets the path variable correctly before running your executable e.g.

path=%path%;C:\Condor\opt\r\2.8.1\bin
r.exe myscript.r

Which R packages are provided in the local installation

Local R Packages [118.9 Kb]

What is the maximum practical job length

The maximum practical run time per job is 60 minutes. You should therefore aim to divide your work into multiple jobs, each of which takes no longer than 60 minutes to run on a typical desktop computer. As of this writing, a typical desktop computer has a 3.0GHz Pentium IV processor and 2GB of memory.
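As a worked example of dividing work to fit the 60 minute limit, the sketch below estimates how many work items fit in one job and how many jobs are needed. The item count and the time per item are hypothetical figures that you would replace with your own measurements.

```python
import math

MAX_MINUTES_PER_JOB = 60


def plan_jobs(total_items, minutes_per_item):
    """Return (items per job, number of jobs) so that no job exceeds the limit."""
    if minutes_per_item <= 0:
        raise ValueError("minutes_per_item must be positive")
    items_per_job = int(MAX_MINUTES_PER_JOB // minutes_per_item)
    if items_per_job < 1:
        raise ValueError("a single item takes longer than the limit")
    jobs = math.ceil(total_items / items_per_job)
    return items_per_job, jobs


# Hypothetical workload: 10000 items at half a minute each.
print(plan_jobs(10000, 0.5))
```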

What is the maximum practical job size

The maximum practical data size per job is 50 megabytes of input or output. You should therefore aim to divide your input data into multiple chunks, each no larger than 50 megabytes, and generate no more than 50 megabytes of output data per job.
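The following sketch works out where to cut an input file so that no chunk exceeds 50 megabytes. It assumes that a megabyte is 1024 * 1024 bytes and that the data can be cut at any byte offset; if your data is made up of records or lines, you would need to move each cut to the nearest record boundary.

```python
MAX_CHUNK_BYTES = 50 * 1024 * 1024  # assumes 1 megabyte = 1024 * 1024 bytes


def chunk_ranges(total_bytes, max_chunk=MAX_CHUNK_BYTES):
    """Return (start, end) byte offsets, with each chunk no larger than max_chunk."""
    return [
        (start, min(start + max_chunk, total_bytes))
        for start in range(0, total_bytes, max_chunk)
    ]


# Hypothetical 120 megabyte input file.
ranges = chunk_ranges(120 * 1024 * 1024)
print(len(ranges), "chunks")
```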


Error Messages


When installing Condor I get the error "ERROR: The process "condor_*.exe" not found"

This error appears only during installation and can be ignored.

When running condor_* I get the error "'condor_*' is not recognized as an internal or external command..."

This error appears when the Condor programs have not been added to the path variable of your current command prompt.  

To run a command prompt with the path variable set correctly click on Start > Networked Applications > Departmental Software > ARCCA > Condor > Condor Prompt.
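If Python is available on your machine, a quick way to see whether the Condor programs are on the path is sketched below. The list of program names is an assumption based on the commands mentioned in this FAQ.

```python
import shutil


def missing_tools(tools=("condor_submit", "condor_status", "condor_store_cred")):
    """Return the names of the programs that cannot be found on the path."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on the path: " + ", ".join(missing))
else:
    print("All Condor programs were found on the path.")
```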

When running condor_store_cred add I get the error "Operation failed: bad password"

This error appears when the password entered does not match your Novell password.

Run condor_store_cred add again and this time provide your Novell password.

When running condor_submit I get the error "No credential stored for user@hostname"

This error appears when you have not already stored a user credential.

Run condor_store_cred add.

I get the error "You are running out of disk space on C"

If you do not submit jobs to the Condor pool then the space occupied by the optional installation packages on your machine is too great. Please contact us via the helpdesk and ask us to give you access to the "Remove Condor" application object. You will need to tell us the MAC address of your machine so that we can give you access to that object and delete the association with the appropriate "Install Condor" application object. Deleting the association stops the problem from recurring, for example when Condor is upgraded.

If you submit jobs to the Condor pool then there are two possibilities: (1) the space occupied by the optional installation packages on your machine is too great, or (2) the space occupied by the results of your jobs on your machine is too great. Please contact us via the helpdesk to discuss how we can prune your C:\Condor directory.