Click on characters below to create text in the box below that, then copy & paste to your content.
You must have JavaScript enabled. Choose a view from the list just below the page title. To produce text in the output area, click on character shapes, or use your keyboard for Latin characters, delete, etc. Then cut & paste the result to your document, or use the buttons to get further information about the characters.
You can also add codepoints and escapes via the "Add codepoint" field (press Return to add them to the output field), or paste text into the output field to get information about it. Use the yellow boxes to set preferences or search. Regular expressions are allowed when searching. For example, to find characters with the word KA in their name, enter \bka\b, or the short form :ka:.
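As a rough illustration of the name search described above, the following Python sketch expands the :ka: short form into \bka\b and matches it against Unicode character names. The function names, the case-insensitive matching and the restriction to the Arabic block (U+0600 to U+06FF) are assumptions made for the example; they are not taken from the picker's own code.

```python
import re
import unicodedata


def expand_short_form(query: str) -> str:
    # Assumed behaviour: ':word:' is shorthand for a whole-word match on 'word'.
    match = re.fullmatch(r":(.+):", query)
    return rf"\b{match.group(1)}\b" if match else query


def search_names(query: str, first: int = 0x0600, last: int = 0x06FF) -> list[str]:
    # Match the (expanded) pattern against each character name in the range.
    pattern = re.compile(expand_short_form(query), re.IGNORECASE)
    hits = []
    for cp in range(first, last + 1):
        name = unicodedata.name(chr(cp), "")  # "" for unassigned codepoints
        if name and pattern.search(name):
            hits.append(f"U+{cp:04X} {name}")
    return hits


for hit in search_names(":heh:"):
    print(hit)
```

Because of the word boundaries, a search for :heh: finds ARABIC LETTER HEH DOACHASHMEE but not ARABIC LETTER TEHEH, where the letters only occur inside a longer word.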
About the chart
Includes characters in the Unicode Arabic block used for Urdu. Most of the characters in the Urdu standard UZT 1.01 are included.
All text is output in Unicode normalisation form NFC by default. You can change to NFD or no normalisation by clicking on the buttons in the yellow area. Note that normalisation only takes place when you click on a character: text pasted into the box won't be normalised until you click on another character above, or click on a button in the yellow area. (Note: normalisation is turned off for Han characters in this application.)
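The difference between the forms, and the point at which normalisation is applied, can be sketched in Python with the standard unicodedata module. The OutputBox class below is a toy model written for this example, not the picker's implementation; it assumes that only clicks and the normalisation buttons trigger normalisation, and it ignores the Han exception.

```python
import unicodedata


class OutputBox:
    """Toy model of the output area (hypothetical, for illustration only)."""

    def __init__(self, form: str = "NFC") -> None:
        self.form = form  # "NFC", "NFD" or "none"
        self.text = ""

    def _normalise(self) -> None:
        if self.form != "none":
            self.text = unicodedata.normalize(self.form, self.text)

    def paste(self, pasted: str) -> None:
        # Pasting adds text as-is; nothing is normalised yet.
        self.text += pasted

    def click(self, char: str) -> None:
        # Clicking a character normalises the whole output.
        self.text += char
        self._normalise()

    def set_form(self, form: str) -> None:
        # Changing the normalisation form re-normalises the output.
        self.form = form
        self._normalise()


def codepoints(text: str) -> list[str]:
    return [f"U+{ord(c):04X}" for c in text]


box = OutputBox()
box.paste("\u0627\u0653")    # alef + combining madda above (decomposed)
print(codepoints(box.text))  # ['U+0627', 'U+0653']
box.click("\u0628")          # clicking beh triggers NFC
print(codepoints(box.text))  # ['U+0622', 'U+0628']
```

In this model the pasted alef and madda stay as two codepoints until the next click, at which point NFC combines them into U+0622 ARABIC LETTER ALEF WITH MADDA ABOVE.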
Alternative views
The following alternative views are available by clicking just below the page title. You can start up directly in one of the views by appending the following to your URI: ?view=, followed by one of, respectively, default, shape, transcription, phones or fontgrid.
Default: This view is likely to be more useful to people who are somewhat familiar with the alphabet and characters of Urdu. Characters are arranged based on the use of the script in order to speed up picking.
Top left are the letters of the alphabet, in the standard order. Other commonly used letters appear below. To their right, from top to bottom, we have combining vowel characters, symbols, digits and punctuation. There are also a number of Unicode control characters for managing direction of text. (Note that for HTML you should use markup, where possible, rather than RLE, LRE, PDF, RLO and LRO.)
Shape: This view is purely based around shape, and is therefore good when you don't know the script well at all, or for shapes you don't know. Characters are grouped and ordered by visual similarity. It is, however, very difficult to provide adequate shape-based lookup for cursive nastaliq and non-nastaliq text, and this view doesn't try to be exhaustive.
Each orange key near the top of the page represents a significant part of the shape of two or more characters; as you mouse over the keys, characters and combinations of characters that incorporate that shape are displayed below. Click on these characters to add them to the output. Within a group I attempted to put easily confusable characters close to each other.
The shapes grouped under 'Other' are a mixed bag of characters that didn't fit elsewhere.
The last orange shape to the right shows all combining characters. The shapes just to the left of that group characters by the dots or other marks that appear around them. (This may help in transcribing some of the cursive forms.)
An orange plus sign to the right of a set of shapes is followed by characters that don't quite fit the current shape group, but may cause confusion because they share elements, or because their shape may be similar, though not quite the same.
Transcription: I use this for typing in text for which I have a transcription, or for creating phonemic transcriptions.
The large characters on a grey background represent characters used for Urdu transcription in both the Library of Congress scheme and Teach Yourself Urdu by David Matthews and Mohamed Kasim Dalvi. To type Urdu text starting from a transcription, click on these characters. If there is only one Urdu character corresponding to the transcription letter, it is inserted directly into the output field. If there are multiple alternatives, these are presented to you in a selection list: click on the Urdu character you need in the selection list and it is added to the output.
Each Urdu character is associated with a phonetic symbol (a Latin/IPA symbol on a white background to its left in the selection lists). If there is more than one possible phonetic representation, you will see the selection list divided appropriately. In some cases an Urdu character is repeated within the same selection list because it has more than one possible phonetic equivalent; in such cases, choose the one that produces the phonetic transcription you want. As you select characters, the phonetic symbol to the left of each one is added to the Phoneme bank area, below the output area. This is quite basic, but is offered as a way of speeding up text entry where you want to type both the Urdu characters and the phonemic transcription. You can edit the text in the phoneme bank, if you wish, and you can move it into the main output area at the current cursor position by clicking on Add.
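The selection behaviour described above (insert directly when there is one candidate, otherwise offer a list, and record the phonetic symbol on selection) might be modelled roughly as follows. The CANDIDATES table is a small illustrative sample, and the Session class and its method names are hypothetical; none of this reproduces the picker's real data or code.

```python
from dataclasses import dataclass

# Illustrative sample only: transcription letter -> (Urdu character, phonetic symbol).
CANDIDATES: dict[str, list[tuple[str, str]]] = {
    "b": [("\u0628", "b")],                                    # beh
    "s": [("\u0633", "s"), ("\u0635", "s"), ("\u062B", "s")],  # seen, sad, theh
}


@dataclass
class Session:
    output: str = ""
    phoneme_bank: str = ""

    def select(self, urdu: str, phone: str) -> None:
        # Selecting a character adds it to the output and its phone to the bank.
        self.output += urdu
        self.phoneme_bank += phone

    def press(self, latin: str) -> list[tuple[str, str]]:
        # Returns the selection list, or [] if the character was inserted directly.
        options = CANDIDATES.get(latin, [])
        if len(options) == 1:
            self.select(*options[0])
            return []
        return options


session = Session()
session.press("b")            # single candidate: inserted directly
choices = session.press("s")  # several candidates: the user must pick one
session.select(*choices[1])   # pick sad
print(session.output, session.phoneme_bank)
```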
The vowels to the left of the first line of vowels produce no output, but allow you to capture phones for creating a phonemic transcription. Vowels to their right list non-pointed characters. The vowels on the line below are for creating fully-vowelled text. Often these will need to be used with alef, he or ayin - in which case, look for those characters under 'silent' in the list of consonants above.
There are two selectors for shadda/tašdīd. One inserts a tašdīd in the Urdu text and doubles the last character in the phoneme bank; the other just doubles the character at the end of the phoneme bank. There is no mechanism to automatically deal with Arabic definite article pronunciation, where the first sound of the following word is doubled. You will need to handle that manually.
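The two tašdīd selectors can be sketched as a pair of small Python functions. This is a simplified model under stated assumptions: the function names are hypothetical, and "the last character" is taken to be the last code point of the phoneme text, which would not be right for a phone written with a combining mark.

```python
SHADDA = "\u0651"  # ARABIC SHADDA


def double_last(phonemes: str) -> str:
    # Repeat the final code point of the phoneme text, if there is one.
    return phonemes + phonemes[-1] if phonemes else phonemes


def shadda_both(urdu: str, phonemes: str) -> tuple[str, str]:
    # First selector: add a shadda to the Urdu text and double the last phone.
    return urdu + SHADDA, double_last(phonemes)


def shadda_phoneme_only(urdu: str, phonemes: str) -> tuple[str, str]:
    # Second selector: leave the Urdu text alone and only double the last phone.
    return urdu, double_last(phonemes)


urdu, phonemes = shadda_both("\u0631\u0628", "rab")  # reh + beh
print(urdu, phonemes)  # the Urdu text gains a shadda; the phonemes become "rabb"
```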
For less common characters, switch to the Alphabetic view.
As you mouse over the Latin characters on the grey background, the corresponding Urdu characters are also displayed near the top of the page. This is to aid in searching.
Transcription > Latin: This represents the union of all transcription and phonetic characters, and is provided in case you wish to just type in a transcription directly.
Font grid: Shows characters in Unicode order, using whatever font is specified in the Font list or Custom font input fields. This allows comparison of fonts (especially useful in IE, which shows if a glyph is missing from a font).
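To start up directly in one of these views, the ?view= parameter mentioned at the start of this section is appended to the page URI. A minimal Python sketch follows; the base URI is a placeholder, the function name is hypothetical, and the pairing of view names with parameter values assumes that "respectively" follows the order in which the views are described here.

```python
from urllib.parse import urlencode

# Assumed pairing of view names with ?view= values, in the order listed above.
VIEW_PARAMS = {
    "Default": "default",
    "Shape": "shape",
    "Transcription": "transcription",
    "Transcription > Latin": "phones",
    "Font grid": "fontgrid",
}


def view_uri(base_uri: str, view_name: str) -> str:
    # Assumes base_uri has no query string of its own.
    if view_name not in VIEW_PARAMS:
        raise ValueError(f"unknown view: {view_name!r}")
    return f"{base_uri}?{urlencode({'view': VIEW_PARAMS[view_name]})}"


# Placeholder address, not the picker's real location.
print(view_uri("https://example.org/pickers/urdu/", "Font grid"))
# https://example.org/pickers/urdu/?view=fontgrid
```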
Other features
For further information about features of the tool or user interface, see How to use.
Useful URIs
Urdu lite, a cut-down version of this picker for handheld devices.