Workflow¶
We will use GitHub flow going forward.
In short, the only steps you need to know are:
@at github: fork the repo
git clone <your repo url>
cd <repo-dir>
## after you make some changes
git add . # stage all changes
git commit -m "<describe what you have done>" # commit
git push # push it to the remote (GitHub)
@at github: submit a pull request
Getting Started with Python Locally¶
Virtual Environment (and for package management)¶
Without a virtual environment, you are very likely to run into dependency errors.
Both Anaconda and venv are available; please choose one. I recommend Anaconda, but only because it is the one I am familiar with.
There are plenty of tutorials on the internet in Traditional Chinese, so go through them if you need to.
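As a quick sanity check after picking a tool, the sketch below reports whether the running interpreter sits inside an environment. The helper names `in_virtualenv` and `in_conda_env` are made up for illustration. The venv check uses the standard `sys.prefix` / `sys.base_prefix` comparison; the conda check assumes that `conda activate` exports the `CONDA_PREFIX` environment variable.

```python
import os
import sys


def in_virtualenv() -> bool:
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def in_conda_env() -> bool:
    """Return True when a conda environment appears to be activated."""
    # Assumption: `conda activate` exports CONDA_PREFIX.
    return "CONDA_PREFIX" in os.environ


print(f"venv active: {in_virtualenv()}")
print(f"conda env active: {in_conda_env()}")
```

If both lines print `False`, packages are probably being installed into the system-wide Python.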
IDE¶
VSCode is highly recommended. Please install the helpful extensions for Python development, e.g. Pylance and autoDocstring.
Typed Python¶
Adding type hints to your code is important for both you and your teammates.
For example:
from typing import Tuple

def foo(arg1: int, arg2: str) -> Tuple[int, str]:
    ...  # do something
    return arg1, arg2
Please add type hints throughout your code. In some cases, however, you can omit them:
class Foo:
    @classmethod
    def a_classmethod(cls, arg: int):
        # cls is implicitly the class Foo, no need to annotate it
        pass

    def member_func(self, arg: int):
        # self is implicitly of type Foo, no need to annotate it
        ...  # do something
        return

def foo(*args, **kwargs):
    """args and kwargs are implicitly a 'tuple' and a 'dict' respectively."""
    pass
Type hints are not checked by the interpreter at runtime. They still matter a lot to programmers, because they let Pylance help us.
For example:
class Foo:
    name: str
    """tell Pylance there is an attribute 'name' with type 'str'"""

    def __init__(self, name: str):
        self.name = name

    def say_hello(self) -> None:
        print(f"hello {self.name}")

foo = Foo("world")
print(foo.name) # Pylance colors 'name' as a variable
>>> world
print(foo.naem) # Pylance does not color 'naem', which tells you something may be wrong
>>> AttributeError
print(foo.name.upper()) # Pylance knows 'str' has a method called 'upper'; hover over 'upper' to see its docs if you need them
>>> WORLD
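To make the "not checked at runtime" point concrete, here is a minimal sketch. The function `repeat` is a made-up example, not part of the project. The annotations are stored on the function and can be inspected, but a call that violates them still runs as long as the operation itself is valid.

```python
from typing import get_type_hints


def repeat(text: str, times: int) -> str:
    """Repeat `text` `times` times."""
    return text * times


# The annotations are stored on the function, but Python never enforces them.
hints = get_type_hints(repeat)
print(hints)

# Passing a list where a str is annotated still runs, because list * int is valid.
result = repeat([1, 2], 2)
print(result)
```

An editor with type checking turned on would flag the second call while you type, which is exactly the help the interpreter does not give.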
Sometimes a function does not explicitly declare its return type. You can still tell Pylance what the type is.
```python
output: Tensor = model(inputs)
# output is treated as a Tensor no matter what model(inputs) actually returns
```
For a function returning a tuple, there are two ways to add explicit type hints.
```python
# wrong syntax
a: str, b: int = func_return_str_and_int()

# correct
from typing import Tuple
tem: Tuple[str, int] = func_return_str_and_int()
a, b = tem

# correct and recommended
a, b = func_return_str_and_int()
a: str # does nothing but tell Pylance that a is a string
b: int # does nothing but tell Pylance that b is an int
```
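The general principle is to make Pylance recognize every name. Typically, (variables, attributes, ...), functions, and (classes and modules) each get their own color. If any text stays "white" (the actual color depends on your theme), it means something needs a type hint, or you made a mistake.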
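When you own the function, another option is to annotate its return type, so the unpacked names need no extra hints. A minimal sketch, where the body of `func_return_str_and_int` is an invented stand-in for the function used in the snippets above:

```python
from typing import Tuple


def func_return_str_and_int() -> Tuple[str, int]:
    """Hypothetical stand-in: returns a string and an int."""
    return "answer", 42


# No extra annotations needed: the types come from the return annotation.
a, b = func_return_str_and_int()
print(a.upper())
print(b + 1)
```

The explicit hints at the call site remain useful for third-party functions whose return type is missing or too loose.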
Best Practice¶
Reusing Structure¶
- Bad example
def deal_with_history(history):
    loss = history["train_loss"] # pylance: what is 'loss'?
    accuracy = history["train_accuracy"] # raises KeyError on this line
    pass

history = {
    "train_loss": 1.01,
    "train_acc": 0.99
}
deal_with_history(history)
You will not find the error until runtime.
- Better practice with Namespace
from argparse import Namespace

class History(Namespace):
    train_loss: float
    train_acc: float

def deal_with_history(history: History):
    loss = history.train_loss # pylance: yes, 'history' has an attribute 'train_loss' of type 'float'
    accuracy = history.train_accuracy # pylance: what is 'train_accuracy'?
    pass

history = History(train_loss=1.01, train_acc=0.99)
deal_with_history(history)
As you can see, the main advantage of using a typed object (in this case, Namespace) is that you can spot the error right after you type history.train_accuracy, rather than hitting it at runtime.
Other typed structures are namedtuple for tuples and TypedDict for dicts.
However, Namespace and namedtuple are recommended over TypedDict.
TypedDict is less recommended because it works exactly like a normal dict apart from the type hints. Values in a dict can only be accessed by string keys, and a typo inside a string sometimes goes unflagged.
e.g.
from typing import Tuple

try:
    from typing import TypedDict # available in Python 3.8 and later
except ImportError:
    from typing_extensions import TypedDict # use this for Python < 3.8

class Image(TypedDict):
    color: str
    size: Tuple[int, int]

img = Image(color="blue", size=(100, 100))
print(img["size"]) # pylance: well... yes
print(img["colour"]) # pylance: well... yes (but raises KeyError at runtime because there is no key 'colour')
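For comparison, the namedtuple option mentioned above can be written with `typing.NamedTuple`, which gives attribute access instead of string keys. This is a minimal sketch; the `Image` class here is a hypothetical record that mirrors the TypedDict example.

```python
from typing import NamedTuple, Tuple


class Image(NamedTuple):
    """Hypothetical typed record mirroring the TypedDict example."""

    color: str
    size: Tuple[int, int]


img = Image(color="blue", size=(100, 100))
print(img.color)  # attribute access, so a typo such as img.colour is not hidden in a string
width, height = img.size
print(width * height)

# A NamedTuple is still a tuple, so positional unpacking keeps working.
color, size = img
```

Note that a NamedTuple is immutable, so it fits records that are created once and then only read.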
Assertion¶
An example:
from typing import Literal # Python 3.8 and later
from torch import nn
def do_something(model: nn.Module, mode: Literal["train", "eval"]):
    assert mode in ["train", "eval"], f"expected mode to be 'train' or 'eval', got {mode}"
Adding assert statements where appropriate tells others which error cases you have considered, and a clear message helps them debug without reading through all of your code.
I wrote the example above with mode: Literal. However, it may be a better idea to use an Enum. (Search for "python enum".)
from enum import Enum, auto

class Mode(Enum):
    TRAIN = auto()
    EVAL = auto()
    VALID = EVAL # alias for EVAL
    INFERENCE = auto()

mode1 = Mode.TRAIN
mode2 = Mode.EVAL
print(mode1 is Mode.TRAIN)
>>> True
print(mode1 is mode2)
>>> False
print(mode2 is Mode.VALID)
>>> True

def do_something(model: nn.Module, mode: Mode):
    assert isinstance(mode, Mode), f"expected mode to be an instance of Mode, got {type(mode)}"
    pass
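The sketch below shows what such an assertion looks like when it fires, without depending on torch. `describe` is a hypothetical helper and the two-member `Mode` is a trimmed copy of the enum above. Keep in mind that assert statements are skipped when Python runs with the `-O` flag.

```python
from enum import Enum, auto


class Mode(Enum):
    TRAIN = auto()
    EVAL = auto()


def describe(mode: Mode) -> str:
    """Hypothetical helper that only accepts Mode members."""
    assert isinstance(mode, Mode), f"expected an instance of Mode, got {type(mode)}"
    return f"running in {mode.name.lower()} mode"


print(describe(Mode.TRAIN))

try:
    describe("train")  # a plain string is rejected with a readable message
except AssertionError as err:
    print(f"caught: {err}")

# Strings coming from a config file can be converted by member name.
print(Mode["EVAL"] is Mode.EVAL)
```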
Pylint¶
Pylint is a linter for Python. (A linter is a rule checker that enforces code consistency.)
Pylint is largely based on the PEP 8 style guide.
Lint in terminal¶
In the terminal, type
pylint <module_name or py_file.py>
Pylint will calculate a score for your code, like
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
If your score is not 10/10, Pylint tells you why and where (which file and which line) you lost points.
For example:
************* Module do
do.py:71:0: C0301: Line too long (113/100) (line-too-long)
do.py:18:8: W0612: Unused variable 'cat' (unused-variable)
do.py:25:8: W0612: Unused variable 'cat' (unused-variable)
do.py:33:8: W0612: Unused variable 'cat' (unused-variable)
do.py:50:0: C0413: Import "from nfnet.config import NfnetConfig" should be placed at the top of the module (wrong-import-position)
do.py:51:0: C0413: Import "from nfnet.nfnet_model_utils import NfnetModelUtils" should be placed at the top of the module (wrong-import-position)
do.py:69:8: W0612: Unused variable 'cat' (unused-variable)
do.py:5:0: W0611: Unused FatLeNet5 imported from imgclf.models (unused-import)
do.py:5:0: W0611: Unused FakeVGG16 imported from imgclf.models (unused-import)
do.py:7:0: W0611: Unused Hw2Config imported from hw2.config (unused-import)
do.py:8:0: W0611: Unused AlexNet imported from imgclf.models.models (unused-import)
do.py:8:0: C0412: Imports from package imgclf are not grouped (ungrouped-imports)
------------------------------------------------------------------
Your code has been rated at 8.03/10 (previous run: 8.03/10, +0.00)
We only accept a 10/10 score for every commit, and you probably do not want to fix many issues at once.
So consider enabling Pylint in VSCode.
Enable Pylint in VSCode¶
1. Press control+shift+p (cmd+shift+p on Mac) to open the command palette
2. Type python: lint in the input box
3. Choose Python: Select Linter and choose pylint
4. Redo steps 1 and 2
5. Choose Python: Enable/Disable Linting and choose enable
Handling Pylint Error/Warning/...¶
If you get a message and do not know how to fix it, look it up in the message documentation at https://pylint.pycqa.org/en/latest/messages/messages_list.html
or just google it (or ask me or another teammate).
Common Mistake¶
Naming Style¶
It is important to keep your naming style consistent. Here is our naming style for different kinds of names.
- snake_case
  - variable
  - module
  - method
- PascalCase
  - class name
  - type name
- UPPER_CASE
  - constant (Python has no true constants, since any name can be rebound; however, naming a variable in UPPER_CASE is good practice to tell others that it should be treated as a constant)
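The list above can be illustrated with a small sketch. Every name in it (`training_utils`, `MAX_EPOCHS`, `LossTracker`, and so on) is invented for the example and is not part of the project.

```python
"""Hypothetical module `training_utils.py` (module names use snake_case)."""
from typing import List

MAX_EPOCHS = 10  # UPPER_CASE: treat as a constant


class LossTracker:  # PascalCase: class name
    """Keeps the loss values seen so far."""

    def __init__(self) -> None:
        self.loss_values: List[float] = []  # snake_case: attribute

    def add_loss(self, loss_value: float) -> None:  # snake_case: method and argument
        """Record one loss value."""
        self.loss_values.append(loss_value)

    def mean_loss(self) -> float:
        """Return the mean of the recorded loss values."""
        return sum(self.loss_values) / len(self.loss_values)


tracker = LossTracker()  # snake_case: variable
for epoch in range(MAX_EPOCHS):
    tracker.add_loss(1.0 / (epoch + 1))
print(tracker.mean_loss())
```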
Docs¶
Please add a docstring to every method/function/module/class unless it is just a simple wrapper or is sufficiently self-explanatory.
- In Python, we normally use Google-style docstrings (the autoDocstring extension makes this easy)
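For reference, a Google-style docstring is organized into `Args`, `Returns`, and `Raises` sections. The function `scale_values` below is a made-up example, shown only to illustrate the layout.

```python
from typing import List


def scale_values(values: List[float], factor: float = 1.0) -> List[float]:
    """Multiply every value by a constant factor.

    Args:
        values: The numbers to scale.
        factor: The multiplier applied to each value. Defaults to 1.0.

    Returns:
        A new list with each value multiplied by `factor`.

    Raises:
        ValueError: If `values` is empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    return [value * factor for value in values]


print(scale_values([1.0, 2.0], factor=2.0))
```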
line-too-long¶
The line limit is set to 100; please refactor long lines.
- For long if conditions, use
if (
    very_long_variable_name is not None
    and very_long_variable_name.field > 0
    or very_long_variable_name.is_debug
):
    ...
- For long strings, use implicit string concatenation
a_long_string = (
    "this is a long long string"
    "this is the second line"
    "notice that this is all in the same line"
)
- For others, use a backslash \
def abs(somehow_this_is_a_long_variable: int):
    res = somehow_this_is_a_long_variable if\
        somehow_this_is_a_long_variable > 0\
        else -1 * somehow_this_is_a_long_variable
    return res
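One detail about implicit string concatenation: Python joins adjacent literals exactly as written, so each piece needs its own trailing space if the words should stay separated. The sketch below shows that, plus a parenthesized form of the conditional expression that avoids backslashes. The parenthesized form is offered only as an option, not as a project rule, and `absolute` is a hypothetical name.

```python
# Adjacent string literals are joined exactly as written: no space or newline is added.
without_spaces = (
    "first part"
    "second part"
)
with_spaces = (
    "first part "  # note the trailing space
    "second part"
)
print(without_spaces)
print(with_spaces)


def absolute(value: int) -> int:
    """Hypothetical parenthesized variant of the backslash example."""
    return (
        value
        if value > 0
        else -1 * value
    )


print(absolute(-3))
```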
inconsistent-quotes¶
Use double quotes ""
for strings. Note that single quotes are still acceptable inside a double-quoted string.
E.g.
a_str = "this is an inside 'single' quote" # accepted
b_str = 'string with outer single quote' # not accepted
Import outside toplevel¶
Import statements should be at the top of the file. Therefore, working around path issues with sys.path.append(...
is not acceptable in modules.
wrong-import-order¶
Imports should follow this order: standard library imports first, then third-party libraries, then local imports.
# ok
import os # standard module
import torch # third party module
from . import my_module # local module
# wrong
import os
from . import my_module
import torch
Besides, always use from ___ import ___
if you can
# wrong
import torch.nn as nn
# ok
from torch import nn